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Abstract 

Given a multivariate real (or complex) polynomial p and a domain V, we would like to decide 
whether an algorithm exists to evaluate p{x) accurately for all a; € P using rounded real (or 
complex) arithmetic. Here "accurately" means with relative error less than 1, i.e., with some 
correct leading digits. The answer depends on the model of rounded arithmetic: We assume that 
for any arithmetic operator op(a, b), for example a+b ot a-b, its computed value is op{a, b)-{l+S), 
where \S\ is bounded by some constant e where < e <C 1, but 6 is otherwise arbitrary. This 
model is the traditional one used to analyze the accuracy of floating point algorithms. 

Our ultimate goal is to establish a decision procedure that, for any p and 2?, either exhibits 
an accurate algorithm or proves that none exists. In contrast to the case where numbers are 
stored and manipulated as finite bit strings (e.g., as floating point numbers or rational numbers) 
we show that some polynomials p are impossible to evaluate accurately. The existence of an 
accurate algorithm will depend not just on p and V, but on which arithmetic operators are 
available (perhaps beyond +, — , and •), which constants are available to the algorithm (integers, 
algebraic numbers, ...), and whether branching is permitted in the algorithm. For floating point 
computation, our model can be used to identify which accurate operators beyond +, — and • 
(e.g., dot products, 3x3 determinants, ...) arc necessary to evaluate a particular p{x). 

Toward this goal, we present necessary conditions on p for it to be accurately evaluable on 
open real or complex domains V. We also give sufficient conditions, and describe progress toward 
a complete decision procedure. We do present a complete decision procedure for homogeneous 
polynomials p with integer coefficients, V = C", and using only the arithmetic operations +, — 
and •. 
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1 Introduction 

Let X = {xi, Xn) be a vector of real (or complex) numbers, let p{x) denote a multivariate 
polynomial, and let P be a subset of M" (or C"). We would ideally like to evaluate p{x) accurately 
for all x £ D, despite any rounding errors in arithmetic operations. The nature of the problem 
depends on how we measure accuracy, what kinds of rounding errors we consider, the class of 
polynomials p{x), the domain T>, and what operations and constants our evaluation algorithms 
may use. Depending on these choices, an accurate algorithm for evaluating p{x) may or may not 
exist. Our ultimate goal is a decision procedure that will either exhibit an algorithm that evaluates 
p{x) accurately for all x G D, or else exhibit a proof that no such algorithm exists. 

By accuracy, we mean that we compute an approximation Pcomp{x) to p{x) that has small 
relative error: \p{x) — Pcomp{x)\ < 'r]\p{x)\ for some desired < < 1. In particular, r] < 1 implies 
that p{x) = if and only if Pcomp{x) = 0. This requirement that p and Pcomp define the same 
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variety will be crucial in our development. We justify this definition of accuracy in more detail in 
Section 2. 

Our motivation for this work is two-fold. First, it is common for numerical analysts to seek 
accurate formulas for particularly important or common expressions. For example, in computational 
geometry and mesh generation, certain geometric predicates like "Is point x inside, on or outside 
circle C?" are expressed as multivariate polynomials p{-) whose signs determine the answer; the 
correctness of the algorithms depends critically on the correctness of the sign, which is in turn 
guaranteed by having a relative error less than 1 in the value of p(-) [^. We would like to 
automate the process of finding such formulas. 

The second motivation is based on recent work of Koev and one of the authors jJH] which 
identified several classes of structured matrices (e.g., Vandermonde, Cauchy, totally positive, certain 
discretized elliptic partial differential equations, ...) for which algorithms exist to accurately perform 
some (or all) computations from linear algebra: determinants, inversion, Gaussian elimination, 
computing singular values, computing eigenvalues, and so on. The proliferation of these classes of 
structured matrices led us to ask what common algebraic structure these matrix classes possess 
that made these accurate algorithms possible. This paper gives a partial answer to this question; 
see section ini 

Now we consider our model of rounded arithmetic. Let op{-) denote a basic arithmetic operation, 
for example op{x, y) = x + y oi op{x, y,z) = x + y ■ z. Then we assume that the rounded value of 
op{-), which we denote rnd{op{-)), satisfies 



where we call 5 the rounding error. We assume only that \5\ is tiny, \6\ < e, where < e < 1 and 
typically e <C 1; otherwise 5 is an arbitrary real (or complex) number. The constant e is called the 
machine precision, by analogy to floating point computation, since this model is the traditional one 
used to analyze the accuracy of floating point algorithms |35| I67j. 

To illustrate the obstacles to accurate evaluation that this model poses, consider evaluating 
p{x) = xi + X2 + x^ in the most straightforward way: add (and round) xi and X2, and then add 
(and round) X3. If we let 5i be the flrst rounding error and 62 be the second rounding error, we 
get the computed value Pcomp{x) = {{xi + X2)(l + 61) + 2;3)(1 + 52)- To see that this algorithm 
is not accurate, simply choose xi = X2 = 1 and X3 = —2 (so p{x) = 0) and 5i 7^ 0. Then 
Pcompi)-, —2) = 251(1 + 52) / 0, so the relative error is infinite. Indeed, it can be shown that there 
is an open set of x and of {5i,52) where the relative error is large, so that this loss of accuracy 
occurs on a "large" set. We will see that unless xi + X2 + X3 is itself a basic arithmetic operation, 
or unless the variety {xi + X2 + X3 = 0} is otherwise constructible from varieties derived from 
basic operations as described in Theorem 15.161 then no algorithm exists to evaluate xi + X2 + 3:3 
accurately for all arguments. 

In contrast, if we were to assume that the xi and coefficients of p were given as exact rational 
numbers (e.g., as fioating point numbers), then by performing integer arithmetic with sufficiently 
large integers it would clearly be a straightforward matter to evaluate any p{x) as an exact rational 
number. (One could also use fioating point arithmetic to accomplish this; see Sections 12. 31 and l2. 81 ) 
In other words, accurate evaluation is always possible, and the only question is cost. Our model 
addresses this by identifying which composite operations have to be provided with high precision 
in order to evaluate p{x) accurately. For further discussion of the challenge of evaluating a simple 
polynomial like xi + X2 + X3 accurately, see section 12.81 



rnd{op{-)) = op{-){l + 5) 
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We give some examples to illustrate our results. Consider the family of homogeneous polyno- 
mials 

Mjk{x) = 3 ■ xl + xl- xl- {j ■ xl+ j ■ xl-k ■ xl) 

where j and k are positive integers, T) = M", and we allow only addition, subtraction and multipli- 
cation of two arguments as basic arithmetic operations, along with comparisons and branching. 

• When k/ j < 3, Mjk{x) is positive definite, i.e., zero only at the origin and positive elsewhere. 
This will mean that Mjk{x) is easy to evaluate accurately using a simple method discussed 
in Section ini 

• When k/j > 3, then we will show that Mjj.{x) cannot be evaluated accurately by any al- 
gorithm using only addition, subtraction and multiplication of two arguments. This will 
follow from a simple necessary condition on the real variety V-^{Mjk), the set of real x where 
Mjk{x) = 0, see Theorem 14.81 This theorem requires the real variety V]^{p) (or the complex 
variety Vc{p)) to lie in a certain explicitly given finite set of varieties called allowable varieties 
in order to be able to evaluate p{x) accurately in real arithmetic (or in complex arithmetic, 
resp.). 

• When k/j = 3, i.e., on the boundary between the above two cases, Mjk{x) is a multiple of 
the Motzkin polynomial [HZI- Its real variety V^{Mjk) = {x : \xi\ = \x2\ = \xs\} satisfies the 
necessary condition of Theorem 14.81 and the simplest accurate algorithm to evaluate it that 
we know is shown (in part) below: 

if \xi — x-sl < \xi + X3I A \x2 — X3I < \x2 + x-sl then 

p= X^- [4((xi - Xsf + (X2 - X-sf + {Xi - X3){X2 - X3))] 

+xl ■ [2(2(xi - xsf + 5(x2 - X3){xi - X3f + 5(2:2 - xsfixi - X3) + 

2{X2-X3f)] 

+xl ■ [{Xi - Xs)^ + 8{X2 - X3){xi - X3f + 9{x2 - X3)^(xi - Xs)^ + 

8{X2 - X3f{xi - X3) + {X2 - Xs)^] 
+X3 ■ [2(X2 - X3)(xi - X3)((xi - X3f + 2(x2 - X3)(xi - Xs)^ + 

2(X2 - X3)2(xi - X3) + (X2 - X3)^)] 
+ {X2 - X3f{xi - X3)^((xi - X3)^ + (x2 - X3)^) 
p= j -p 

else ... 7 more analogous cases. 

In general, for a Motzkin polynomial in n real variables (n = 3 above), the algorithm has 2" 
separate cases. Just n tests and branches are needed to choose the correct case for any input 
X, so that the cost of running the algorithm is still just a polynomial function of n for any 
particular x. 

In contrast to the real case, when D = C" then Theorem 14.81 will show that Mjk{x) is not 
accurately evaluable using only addition, subtraction and multiplication. 

If we still want to evaluate Mjk{x) accurately in one of the cases where addition, subtraction and 
multiplication alone do not suffice, it is natural to ask which composite or "black-box" operations 
we would need to implement accurately to do so. Section [5] addresses this question. 
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The necessary condition for accurate evaluability of p{x) in Theorem 14.81 depends only on the 
variety of p{x). The next example shows that the variety alone is not enough to determine accurate 
evaluability, at least in the real case. Consider the two irreducible, homogeneous, degree 2d, real 
polynomials 

= (xf + xi'^) + (x? + xi)(g,(x3,...,x„))2 fori = l, 2 (2) 

where qi{-) is a homogeneous polynomial of degree d — 1. Both pi{x) and P2ix) have the same 
real variety Vm,{pi) = Vm.{p2) = {x : xi = X2 = 0}, which is allowable, i.e., satisfies the necessary 
condition for accurate evaluability in Theorem l4.8l However, near xi = X2 = 0, pi{x) is "dominated" 
by {xi+X2){qi{x3, so accurate evaluability ofpi{x) in turn depends on accurate evaluability 
oiqi{x3, ...,Xn)- Since qi{-) may be accurately evaluable while q2{-) is not, we see that VM.{pi) alone 
cannot determine whether Pi{x) is accurately evaluable. Applying the same principle to %(•), 
we see that any decision procedure must be recursive, expanding Pi{x) near the components of 
its variety and so on. We show current progress toward a decision procedure in Section 14.1^1 In 
particular, Theorem 14.441 shows that, at least for algorithms without branching, being able to 
compute dominant terms of p (suitably defined) accurately on is a necessary condition for 
computing p accurately on M". Furthermore, Theorem 14.461 shows that accurate evaluability of the 
dominant terms, along with branching, is sufficient to evaluate p accurately. 

In contrast to the real case. Theorem 14.211 shows that for the complex case, with D = C", and 
using only addition, subtraction and multiplication of two arguments, a homogeneous polynomial 
p{x) with integer coefficients is accurately evaluable if and only if it satisfies the necessary condition 
of Theorem 14.81 More concretely, p{x) is accurately evaluable for all x E C" if and only if can 
be completely factored into a product of factors of the form Xj , Xj + Xj and Xj — xj . 

The results described so far from Section ^ consider only addition, subtraction, multiplication 
and (exact) negation (which we call classical arithmetic). Section considers the same questions 
when accurate black-box operations beyond addition, subtraction and multiplication are permitted, 
such as fused-multiply-add [IHI, or indeed any collection of polynomials at all (e.g., dot products, 
3x3 determinants, ...). The necessary condition on the variety of p from Theorem l4.8l is generalized 
to black-boxes in Theorem 15.161 and the sufficient conditions Theorem 14.211 in the complex case 
are generalized in Theorems 15.271 and 15.281 

The rest of this paper is organized as follows. Section [21 discusses further details of our algorith- 
mic model, explains why it is a useful model of floating point computation, and otherwise justifies 
the choices we have made in this paper. Section |31 discusses the evaluation of positive polynomials. 
Section |1] discusses necessary conditions (for real and complex data) and sufficient conditions (for 
complex data) for accurate evaluability, when using only classical arithmetic. Section [4.31 describes 
progress toward devising a decision procedure for accurate evaluability in the real case using classi- 
cal arithmetic. Section [S] extends Section [^s necessary conditions to arbitrary black-box arithmetic 
operations, and gives sufficient conditions in the complex case. Section [HI describes implications for 
accurate linear algebra on structured matrices. 
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2 Models of Algorithms and Related Work 

Now we state more formally our decision question. We write the output of our algorithm as 
Pcompix, 6), where 6 = {6i, 62, ■■■Sk) is the vector of rounding errors made during the algorithm. 

Definition 2.1. We say that Pcompix,6) is an accurate algorithm for the evaluation of p{x) for 
X if 

V < < 1 ... for any rj = desired relative error 
3 < e < 1 ... there is an e = machine precision 
V X G P ... so that for all x in the domain 

V < e ... and for all rounding errors bounded by e 

\Pcompix,S) — p{x)\ < T] ■ \p{x)\ ... the relative error is at most rj. 

Our ultimate goal is a decision procedure (a "compiler") that takes p{-) and T> as input, and 
either produces an accurate algorithm Pcomp (including how to choose the machine precision e given 
the desired relative error rj) or exhibits a proof that none exists. 

To be more precise, we must say what our set of possible algorithms includes. The above 
decision question is apparently not Tarski-decidable jSSl 112] despite its appearance, because we see 
no way to express "there exists an algorithm" in that format. 

The basic decisions about algorithms that we make are as follows, with details given in the 
indicated sections: 

Sec. I2.lt We insist that the inputs x are given exactly, rather than approximately. 

Sec. 12. 2t We insist that the algorithm compute the exact value oip{x) in finitely many steps when 
all rounding errors 5 = 0. In particular, we exclude iterative algorithms which might produce 
an approximate value of p{x) even when 5 = 0. 

Sec. 12. 3t We describe the basic arithmetic operations we consider, beyond addition, subtraction 
and multiplication. We also describe the constants available to our algorithms. 

Sec. 12. 4t We consider algorithms both with and without comparisons and branching, since this 
choice may change the set of polynomials that we can accurately evaluate. 

Sec. 12.51 If the computed value of an operation depends only the values of its operands, i.e., if the 
same operands x and y of op{x, y) always yield the same 5 in rnd{op{x, y)) = op{x, y) ■ (1 + 5), 
then we call our model deterministic, else it is nondeterministic. We show that comparisons 
and branching let a nondeterministic machine simulate a deterministic one, and subsequently 
restrict our investigation to the easier nondeterministic model. 

Sec. 12. 6t What domains of evaluation T> do we consider? In principle, any semialgebraic set P is 
a possibility, but for simplicity we mostly consider open D, especially P = or V = C". 
We point out issues in extending results to other D. 

Finally, Section [2.71 summarizes the axioms our model satisfies, and Section [2.81 compares our 
model to other models of arithmetic, and explains the advantages of our model. 
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2.1 Exact or Rounded Inputs 

We must decide whether we assume that the arguments are given exactly [HI 1141 1551 162j or are 
known only approximately jl5l 1311 I^Tl I53j . Not knowing the input x exactly means that at best 
(i.e., in the absence of any further error) we could only hope to compute the exact value of for 
some X ~ X, an algorithmic property known as backward stability jl9ll35j . Since we insist that zero 
outputs be computed exactly in order to have bounded relative error, this means there is no way 
to guarantee that p{x) = when p{x) = 0, for nonconstant p. This is true even for simple addition 
xi + X2. So we insist on exact inputs in our model. 

2.2 Finite Convergence 

Do we consider algorithms that take a bounded amount of time for all inputs x £ D, and return 
PcompixjO) = p{x), i.e., the exact answer when all rounding errors are zero? Or do we consider 
possibly iterative algorithms that might take arbitrarily long on some inputs to produce an ade- 
quately accurate answer? We consider only the former, because (1) it seems natural to use a finite 
algorithm to evaluate a finite object like a polynomial, (2) we have seen no situations where an 
iterative algorithm offers any advantage to obtaining guaranteed relative accuracy and (3) this lets 
us write any algorithm as a piecewise polynomial function and so use tools from algebraic geometry. 

2.3 Basic Arithmetic Operations and Constants 

What are the basic arithmetic operations? For most of the paper we consider addition, subtraction 
and multiplication of two arguments, since this is necessary and sufficient for polynomial evaluation 
in the absence of rounding error. Furthermore, we consider negation as a basic operation that is 
always exact (since this mimics all implementations of rounded arithmetic) . Sometimes we will also 
use (rounded) multiplication by a constant op{x) = c - x. We also show how to extend our results 
to include additional basic arithmetic operations like op{x, y,z) = x + y ■ z. The motivations for 
considering such additional "black-box" operations are as follows: 

1. By considering operations like x + c, x — c and c • x for any c in a set C of constants, we may 
investigate how the the choice of C affects accurate evaluability. For example, if C includes 
the roots of a polynomial like p{x) = — 2, then we can accurately evaluate p{x) with the 
algorithm (x — \/2) ■ {x + v^), but otherwise it may be impossible. We note that having C 
include all algebraic numbers would in principle let us evaluate any univariate polynomial 
p{x) accurately by using its factored form p{x) = cY[i=i{x — ri). 

In the complex case, it is natural to consider multiplication by \/— 1 as an exact operation, 
since it only involves "swapping" and possibly negating the real and imaginary parts. We 
can accommodate this by introducing operations like x + \/— 1 • y and x — ^J—l ■ y. 

The necessary conditions in Theorem 15.161 and sufficient conditions in Theorems 15.271 and 
15.281 do not depend on how one chooses an operation x — rj from a possibly infinite set, 
just whether that operation exists in the set. On the other hand, a decision procedure 
must effectively choose that operation, so our decision procedures will restrict themselves to 
enumerable (typically finite!) sets of possible operations.^ 

^We could in principle deal with the set of all instructions x — r for r an arbitrary algebraic number, because the 
algebraic numbers are enumerable. 
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2. Many computers now supply operations like x+y-z in hardware with the accuracy we demand 
(the fused-multiply-add instruction [IHl)- It is natural to ask how this operation extends the 
class of polynomials that we can accurately evaluate. 

3. It is natural to build a library (in software or perhaps even hardware) containing several such 
accurate operations, and ask how much this extends the class of polynomials that can be 
evaluated accurately. This approach is taken in computational geometry, where the library 
of accurate operations is chosen to implement certain geometric predicates precisely (e.g., 
"is point X inside, outside or on circle C?" written as a polynomial whose sign determines 
the answer). These precise geometric predicates are critical to performing reliable mesh 
generation [KHj . 

4. A common technique for extending floating point precision is to simulate and manipulate 
extra precision numbers by representing a high precision number ?/ as a sum y = "^^=1 Vi of 
numbers satisfying ^ the idea being that each y^ represents (nearly) disjoint parts 
of the binary expansion of y (see [SlIHl 1161 1231 14U1 W7\ 1521 154j and the references therein; similar 
techniques were used by Gill as early as 1951). This technique can be modeled by the correct 
choice of black-box operations as we now illustrate. Suppose we include the enumerable set 
of black-box operations ^^^i Pi, where n is any finite number, and each pi is the product of 1 
to d arguments. In other words, we include the accurate evaluation of arbitrary multivariate 
polynomials in Z[x] of degree at most d among our black-box operations. Then the following 
sequence of operations produces as accurate an approximation of any such polynomial 



= ?/i + y2H hyfc 



as desired: 



1=1 



yi = rnd(^pi) = {I + 5i)C^pi) 

i=l i=l 
n n 

y2 = rndi^pi - yi) = {I + 62)C^Pi - yi) 

i=l i=l 
n k—1 n k—l 

yk = rnd(^pi-^yj) = {l + 6k){J2Pi-^yj) 

i=l 5=1 i=l 7=1 



Induction shows that 



so that y = X]j=i Vj approximates the desired quantity with relative error at most e'^. Despite 
this apparent power, our necessary conditions in Theorem 15 . 161 and Section will still show 
limits on what can be evaluated accurately. For example, no irreducible polynomial of degree 
> 3 can be accurately evaluable over if only dot products (degree d = 2) are available. 
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5. Another standard technique for extending floating point precision is to spht a floating point 
number x with h bits in its fraction into the exact sum x = Xhi + xio-, where Xhi and xio each 
have only 6/2 bits in their fractions. Then products like 

x-y = {xhi + xio) • ivhi + yio) = Xhi • Vhi + Xhi ■ yio + xio • Vhi + xio ■ yio 

can be represented exactly as a sum of 4 floating point numbers, since each product like 
Xhi ■ Vhi has at most b bits in its fraction and so can be computed without error. Arbitrary 
products may be computed accurately by applying this technique repeatedly, which is the 
basis of some extra-precise software floating point libraries like jSHl • The proof of accuracy of 
the algorithm for splitting x = Xhi + xio is intrinsically discrete, and depends on a sequence 
of classical operations some of which can be proven to be free of error j59| Thm 17], and 
similarly for the exactness of Xhi ■ Vhi- Therefore this exact multiplication operation cannot 
be built from simpler, inexact operations in our classical model. But we may still model this 
approach as follows: We imagine an exact multiplication operation x ■ y, and note that all 
we can do with it is feed it into the inputs of other operations. This means that from the 
operation rnd{z + w) we also get rnd{x ■ y + w) , rnd{x-y + r-s), rnd{x-y-z + w), and so on. In 
other words, we take the other operations in our model and from each create an enumerable 
set of other black-boxes, to which we can apply our necessary and sufficient conditions. 

2.4 Comparisons and Branching 

Are we permitted to do comparisons and then branch based on their results? Are comparisons exacts 
i.e., are the computed values oi x > y, x = y and x < y (true or false) always correct for real x and 
y? (For complex x and y we consider only the comparison x = y.) We consider algorithms both 
without comparisons (in which case Pcompix, S) is simply a polynomial), and with exact comparisons 
and branching (in which case Pcompix, S) is a piecewise polynomial, on semialgebraic sets determined 
by inequalities among other polynomials in x and 6). We conjecture that using comparisons and 
branching strictly enlarges the set of polynomials that we can evaluate accurately. 

We note that by comparing x — to zero for selected constants r^, we could extract part of the 
bit representation of x. Since we are limiting ourselves to a finite number of operations, we could 
at most approximate x this way, and as stated in Section f2. 11 this means we could not exploit this 
to get high relative accuracy near p{x) = 0. We note that the model of arithmetic in excludes 
real^integer conversion instructions. 

2.5 Nondeterminism 

As currently described, our model is nondeterministic, e.g., the rounded result of 1 + 1 is not 
necessarily identical if it is performed more than once. This is certainly different behavior than 
the deterministic computers whose behavior we are modeling. However, it turns out that this is 
not a limitation, because we can always simulate a deterministic machine with a nondeterministic 
one using comparisons and branching. The idea is simple: The first addition instruction (say) 
records its input arguments and computed sum in a list. Every subsequent addition instruction 
compares its arguments to the ones in the list (which it can do exactly), and either just uses the 
precomputed sum if it finds them, or else does the addition and appends the results to the list. In 
other words, the existence (or nonexistence) of an accurate algorithm in a model with comparisons 
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and branching does not depend on whether the machine is deterministic. So for simphcity, we will 
henceforth assume that our machines are nondeterministic. 

2.6 Choice of Domain V 

As mentioned in the introduction, it seems natural to consider any semialgebraic set as a possible 
domain of evaluation for p{x). While some choices, like V = {x : p{x) = 0} make evaluating p{x) 
trivial, they beg the question of how one would know whether x £ D. Similarly, if D includes a 
discrete set of points, then p{x) can be evaluated at these points by looking up the answers in a 
table. To avoid these pathologies, it may seem adequate restrict P to be a sufficiently "fat" set, 
say open. But this still leads to interesting complications; for example the algorithm 

Pcomp{x,6) = {{Xi + X2)(l + 5i) +X3)(1 + ^2) 

for p{x) = xi + X2 + X3, which is inaccurate on R", is accurate on the open set 
+ 3;2| > 21x3!}, whose closure intersects the variety Vm,{p) on {xi + X2 = A 2:3 = 0}. 
In this paper we will mostly deal with open T>, especially T> = or P = C", and comment on 
when our results apply to smaller T>. 

2.7 Summary of Arithmetic and Algorithmic Models 

We summarize the axioms our arithmetic and algorithms must satisfy. We start with the axioms 
all arithmetic operations and algorithms satisfy: 

Exact Inputs. Our algorithm will be given the input exactly. 

Finite Convergence. An accurate algorithm must, when all roundoff errors 5 = 0, compute the 
exact value of the polynomial in a finite number of steps. 

Roundoff Model. Except for negation, which is always exact, the rounded value of any arithmetic 
operation op(xi, ....,Xk) satisfies 

rnd{op{xi, ...,Xk)) = op{xi, ...,Xk) ■ {1 + S) 

where 6 is arbitrary number satisfying \5\ < e, where e is a nonzero value called the machine 
precision. If the data x is real (or complex), then 6 is also real (resp. complex). 

Nondeterminism. Every arithmetic operation produces an independent roundoff error 5, even if 
the arguments to different operations are identical. 

Domain D. Unless otherwise specified, the domain of evaluation T> is assumed to be all of R" (or 
ah of C"). 

We now list the alternative axioms our algorithms may satisfy. In each category, an algorithm 
must satisfy one set of axioms or the other. 

Branching or Not. Some of our algorithms will permit exact comparisons of intermediate quan- 
tities (<, = and > for real data), and subsequent branching based on the result of the 
comparison. Other algorithms will not permit branching. In the complex case, we will see 
that branching does not matter (see Sections 14.21 and 15.2(1 . 
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Classical or "black-box" operations. Some of our algorithms will use only "classical" arith- 
metic operations, namely addition, subtraction and multiplication. Others will use a set of 
arbitrary polynomial "black-box" operations, like op{x, y,z) = x + y ■ z or op{x) = x — -v/2, of 
our choice. In particular, we omit division. 

2.8 Other Models of Error and Arithmetic 

Our goal in this paper is to model rounded, finite precision computation, i.e., arithmetic with 
numbers represented in scientific notation, and rounded to their leading k digits, for some fixed k. 
It is natural to ask about models related to ours. 

First, we point out some positive attributes of our model: 

1. The model rnd{op{a,b)) = op{a,b){l + 5) has been the most widely used model for floating 
point error analysis |35j since the early papers of von Neumann [HE], Turing and Wilkinson 

inzi- 

2. The extension to include black-boxes includes widely used floating point techniques for ex- 
tending the precision. 

3. Though the model is for real (or complex) arithmetic, it can be efficiently simulated on a 
conventional Turing machine by using a simple variation of floating point numbers m ■ 2^, 
stored as the pair of integers (m, e), where m is of fixed length, and |e| grows as necessary. In 
particular, any sequence of n addition, subtraction, multiplication or division (by nonzero) 
operations can increase the largest exponent e by at most 0(n) bits, and so can be done 
in time polynomial in the input size. See |24j for further discussion. This is in contrast to 
repeated squaring in the BSS model which can lead to exponential time simulations. 

Models of arithmetic may be categorized according to several criteria (the references below are 
not exhaustive, but illustrative): 

• Are numbers (and any errors) represented discretely (e.g., as bit strings such as floating point 
numbers) [241 1351 1^ . or as a (real or complex) continuum |1U1 113j ? 

• Is arithmetic exact ^1|H] or rounded El ESI EH? If it is rounded, is the error bounded 
in a relative sense absolute sense ^H], or something else jllJElEI IHSl Sec. 2.9]? 

• In which of these metrics is the final error assessed? 

• Is the input data exact ^01 or considered "rounded" from its true value |12| 13 11 14H I53L I56j 

(and if rounded, again how is the error bounded)? 

• Do we want a "worst case" error analysis 1351 167j. or by modeling rounding errors as random 
variables, a statistical analysis [HSl |38j, £0j [SSI Sec. 2.8]? Does a condition number appear 
explicitly in the complexity of the problem jl5j ? 

First we consider floating point arithmetic itself, i.e., where real numbers are represented by a 
pair of integers (m, n) representing the real number m ■ r", where r is a fixed number called the 
radix (typically r = 2 or r = 10). Either by using one of many techniques in the literature for 
using an array (xi, x^) of floating point numbers to represent x = X]i=i to very high accuracy 
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and to perform arithmetic on such high precision numbers (e.g., [SlinilS])) or by converting m ■ r" 
to an exact rational number and performing exact rational arithmetic, one can clearly evaluate 
any polynomial p{x) without error, and the only question is cost. In light of this, our results on 
classical vs black-box arithmetic can be interpreted as saying when such high precision techniques 
are necessary, and which black-box operations must be implemented this way, in order to evaluate 
p accurately. 

Let us revisit the accurate evaluation of the simple polynomial 2/1 + ?/2 + ys- The obvious 
algorithm is to carry enough digits so that the sum is computed exactly, and then rounded at the 
end. But then to compute {2^ + 1) — 2^ accurately would require carrying at least e bits, which is 
exponential in the size of the input (log2 e bits to represent e). Instead, most practical algorithms 
rely on the technique in the above paragraph, repeatedly replacing partial sums like 1/1+^2 by X1+X2 
where » \x2\ and in fact the bits of xi and X2 do not "overlap." These techniques depend 
intrinsically on the discreteness of the number representation to prove that certain intermediate 
additions and subtractions are in fact exact. Our model treats this by modeling the entire operation 
as a black-box (see Section [T3|) . 

Second, consider our goal of guaranteed high relative accuracy. One might propose that absolute 
accuracy is a more tractable goal, i.e., guaranteeing \pcompix, 6) —p{x)\ < rj instead of \pcornp{x, 5) — 
p{x)\ < i]\p{x)\. However, we claim that as long as our basic arithmetic operations are defined to 
have bounded relative error e, then trying to attain relative error in Pcomp is the most natural goal. 

Indeed, we claim that tiny absolute accuracy is impossible to attain for any nonconstant poly- 
nomial p{x) when T) = M" or D = C". For example, consider p{x) = xi + X2, for which the 
obvious algorithm is Pcompix, 6) = {xi + 2:2) (1 + 5). Thus the absolute error \pcomp{x, 6) — pix)\ = 
\xi + X2\d < \xi + X2\€. This absolute error is at most rj precisely when l^i + X2I < ??/e, i.e., for x in 
a diagonal strip in the (xi, X2) plane. For p{x) = xi • X2 we analogously get accuracy only for x in a 
region bounded by hyperbolas. In other words, even for the simplest possible polynomials that take 
one operation to evaluate, they cannot be evaluated to high absolute accuracy on most of P = 
or C". The natural error model to consider when trying to attain low absolute error in p{x) is to 
have low absolute error in the basic arithmetic operations, and this is indeed the approach taken in 
|15j (though as stated before, repeated squaring can lead to an exponential growth in the number 
of bits a real number represents [Zj). 

One could also consider more complicated error models, for example mixed absolute/relative 
error, \pcompix, S) — p{x)\ < r] ■ max(|p(x)|, 1). Similar models have been used to model underflow 
error in floating point arithmetic A small mixed error implies that either the relative error 

or the absolute error must be small, and so may be easier to attain than either small absolute 
error or small relative error alone. But we argue that, at least for the class of homogeneous 
polynomials evaluated on homogeneous V, the question of whether p{x) is accurately evaluable 
yields the same answer whether we mean accuracy in the relative sense or mixed sense. To see 
why, note that x £ T> if and only if ax G D for any scalar a, since T> is homogeneous, and that 
p{ax) = a'^p{x), where d = degree{p). Thus for any nonzero p{x), scaling x to ax will make 
rj ■ max(|p(ax)|, 1) = r]\p{ax)\ once a is large enough, i.e., relative error 77 must be attained. By 
results in Section [4. .SI this will mean that Pcomp{x,S) must also be homogeneous in x of the same 
de gree, i.e., Pcomp{ctx,5) — oi'^Pcomp{x,S). Thus for any x G 2? at which we can evaluate p(x) with 
high mixed accuracy, we can choose a large enough so that 

OL''\pcomp{x,5) -p{x)\ = \pcomp{ax,5) - p{ax)\ < r] ■ max(|p(ax)|, 1) = r] ■ \p{ax)\ = a'^ ■ r] ■ \p{x)\ 

implying that p{ax) can be evaluated with high relative accuracy for all a. In summary, changing 
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our goal from relative accuracy to mixed relative/absolute accuracy will not change any of our 
results, for the case of homogeneous p and homogeneous P. 

Yet another model is to assume that the input x is given only approximately, instead of exactly 
as we assume. This corresponds to the approach taken in |311 1411 [53], in which one can imagine 
reading as many leading bits as desired of each input Xi from an infinite tape, after which one 
tries to compute the answer using a conventional Turing machine model. This gives yet different 
results, since, for example, the difference xi — X2 cannot be computed with small relative error in 
a bounded amount of time, since xi and X2 may agree in arbitrarily many leading digits. Absolute 
error is more appropriate for this model. 

It is worth commenting on why high accuracy of the sort we want is desirable in light of 
inevitable uncertainties in the inputs. Indeed, many numerical algorithms are successfully analyzed 
using backward error analysis |35M19j . where the computed results are shown to be the exact result 
for a slightly perturbed value of the input. This is the case, for example, for polynomial evaluation 
using Horner's rule where one shows that one gets the exact value of a polynomial at x but with 
slightly perturbed coefficients. Why is this not always accurate enough? 

We already mentioned mesh generation [SH] , where the inputs are approximately known physical 
coordinates of some physical object to be triangulated, but where geometric predicates about the 
vertices defining the triangulation must be answered consistently; this means the signs of certain 
polynomials must be computed exactly, which is in turn guaranteed by guaranteeing any relative 
accuracy 77 < 1. 

More generally, in many physical simulations, the parameters describing the physical system 
to be simulated are often known to only a few digits, if that many. Nonetheless, intermediate 
computations must be performed to much higher accuracy than the input data is known, for 
example to make sure the computed system conserves energy (which it should to high accuracy for 
the results to be meaningful, even if the initial conditions are uncertain). 

Another example where high accuracy is important are the trigonometric functions: When x 
is very large and slightly uncertain, the value of sin x may be completely uncertain. Still, we want 
the computed trigonometric functions to (nearly) satisfy identities like sin^ x + cos^ x = 1 and 
sin2x = 2 sin x cos x so that we can reason about program correctness. Many other examples of 
this sort can be found in articles posted at |37] . 

In the spirit of backward error analysis, one could consider the polynomial p fixed, but settle for 
accurately computing p{x) where x differs from x by only a small relative change in each component 
Xi. This is not guaranteed by Horner's rule, which is equivalent to changing the polynomial p slightly 
but not X. Would it be easier to compute p{x) accurately than p{x) itself? This is the case for 
some polynomials, like X1 + X2 + X3 or C1X2X3 + 02x^X3 + C3X1X2, where there is a unique Xi that we 
can associate with each monomial to "absorb" the rounding error from Horner's rule. In particular, 
with Horner's rule, the number of monomials in p{x) may at most be equal to the number of Xj. In 
analogy to this paper, one could ask for a decision procedure to identify polynomials that permit 
accurate evaluation of p{x) using any algorithm. This is a possible topic for future work. 

Another possibility is to consider error probabilistically j351 Sec. 2.8]. This has been imple- 
mented in a practical system |65j . where a program is automatically executed repeatedly with 
slightly different rounding errors made at each step in order to assess the distribution of the final 
error. This approach is criticized in |2H] for improperly modeling the discrete, non-random behav- 
ior of roundoff, and for possibly invalidating (near) identities like sin 2x = 2 sin x cos x upon which 
correctness may depend. 
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In smoothed analysis jSO], one considers complexity (or for us, relative error) by averaging over a 
Gaussian distribution around each input. For us, input could mean either the argument x of a fixed 
polynomial p, or the polynomial itself, or both. First consider the case of a fixed polynomial p with a 
randomly perturbed x. This case is analogous to the previous paragraph, because the inputs can be 
thought of as slightly perturbed before starting the algorithm. Indeed, one could imagine rounding 
the inputs slightly to nearby rational or floating point numbers, and then computing exactly. But 
in this case, it is easy to see that, at least for codimension 1 varieties of p, the "smoothed" relative 
error is finite or infinite precisely when the worst case relative error is finite or infinite. So smoothing 
does not change our basic analysis. ^ Now suppose one smooths over the polynomial p, i.e., over its 
coefficients. If we smooth using a Gaussian distribution, then as we will see, the genericity of "bad" 
p will make the smoothed relative error infinite for all polynomials. Changing the distribution from 
Gaussian to one with a small support would only distinguish between positive definite polynomials, 
the easy case discussed in section 13 and polynomials that are not positive definite. 

In interval arithmetic [491 15U1 0] one represents each number by a floating point interval guaran- 
teed to contain it. To do this one rounds interval endpoints "outward" to ensure that, for example, 
the sum c = a + b of two intervals yields an interval c guaranteed to contain the sum of any two 
numbers in a and b. It is intuitive that if an interval algorithm existed to evaluate p{x) for x ^T) 
that always computed an interval whose width was small compared to the number of smallest mag- 
nitude in the interval, and if the algorithm obeyed the rules in Section 12.71 then it would satisfy 
our accuracy requirements. Conversely, one might conjecture that an algorithm accurate by our 
criteria would straightforwardly provide an accurate interval algorithm, where one would simply 
replace all arithmetic operation by interval operations. The issue of interpreting comparisons and 
branches using possibly overlapping intervals makes this question interesting, and a possible subject 
for future work. 

Finally, many authors use condition numbers in their analysis of the complexity of solving 
certain problems. This is classical in numerical analysis more recent references are |12 ( 115 ( 113]. 
In this approach, one is willing to do more and more work to get an adequate answer as the condition 
number grows, perhaps without bound. Such a conditioning question appears in our approach, if 
we ask how small the machine precision e must be as a function of the desired relative error r/, 
as well as p, D, and allowed operations. Computing this condition number (outside the easy case 
described in Section ^ is an open question. 

3 Evaluating positive polynomials accurately 

Here we address the simpler case where the polynomial p{x) to be evaluated has no zeros in the 
domain of evaluation P. It turns out that we need more than this to guarantee accurate evaluability: 
we will require that \p{x)\ be bounded both above and below in an appropriate manner on V. 
We let T) denote the closure of T). 

Theorem 3.1. Let pcomp{x,6) be any algorithm for p{x) satisfying pcomp{x,0) = p{x), i.e. it 
computes the right value in the absence of rounding error. Ze^ := inf^g^ |p(x)| . Suppose T) is 
compact and Pmin > 0. Then Pcomp{x,5) is an accurate algorithm for p{x) on T>. 

^The logarithm of the relative error, like the logarithm of many condition numbers, does however have a finite 
average. 
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Proof. Since the relative error on V is \pcomp{x,6) - p(x)\/\p{x)\ < \pcomp{x,S) - p{x)\/pmin, it 
suffices to show that the numerator approproaches uniformly as 5 — > 0. This follows by writing 
the value oi pcompix,6) along any branch of the algorithm as Pcomp{x,6) = p{x) + X]a>oP"(^)'^"' 
where a > is a multiindex with at least one component exceeding 0. By compactness of P, 
I 'l2a>oP^(^)^°'\ — ^X^Q>o l*^!" some constant C, which goes to uniformly as the upper bound 
e on each \6i\ goes to zero. □ 

Next we consider domains D whose closure is not compact. To see that merely requiring 
Pmin > is not enough, consider evaluating p{x) = 1 + {xi + X2 + xs)^ on M'^. Intuitively, p{x) can 
only be accurate if its "dominant term" {xi + X2 + ^3)^ is accurate, once it is large enough, and 
this is not possible using only addition, subtraction and multiplication. (These observations will 
be formalized in Sections 14.31 and HI respectively.) 

Instead, we consider a homogeneous polynomial p{x) evaluated on a homogeneous T>, i.e. one 
where x € T> implies 'jx € T> for any scalar 7. Even though such T> are unbounded, homogeneity 
of p will let us consider just the behavior of p{x) on V intersected with the unit ball S"'~^ in M" 
(or 5^""^ in C"). On this intersection we can use the same compactness argument as above: 

Theorem 3.2. Let p{x) he a homogeneous polynomial, let V he a homogeneous domain, and let S 
denote the unit hall in M" (or C"j. Let 



Then p{x) can he evaluated accurately if Pmin.homo > 0. 

Proof. We describe an algorithm Pcomp{x,8) for evaluating p{x). There are many such algorithms, 
but we only describe a simple one. (Indeed, we will see that the set of all accurate algorithms 
for this situation can be characterized completely by Definition 14.261 and Lemma 14.271 ) Write 
p{x) = X^qCqx", where a is a multiindex (ai, «„), x":=x°^ •••x"", and Cq, 7^ is a scalar. 
Homogeneity implies \a\ = is constant. Then the algorithm simply 

1. computes each x'^ term by repeated multiplication by XjS, 

2. computes each Cq,x° either by multiplication by or by repeated addition if Cq, is an integer, 
and 

3. sums the Cqx" terms. 

Since each multiplication, addition and subtraction contributes a (l + 5j) term, it is easy to see that 



Pmin.homo • 



irif \p{x)\ 
x&vns 




a 



where each Aq, is the product of at most some number / of factors of the form 1 + 6i. 

Now let ||a;||2 = (X^j jxjp)^/^, so x = x/\\x\\2 is in the unit ball 5. Then the relative error may 
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be bounded by 
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which goes to zero uniformly in e. 



□ 



4 Classical arithmetic 

In this section we consider the simple or classical arithmetic over the real or complex fields, with 
the three basic operations {+, — , •}, to which we add negation. The model of arithmetic is governed 
by the laws in Section [2 .71 We remind the reader that this arithmetic model does not allow the use 
of constants. 

In Section 14.11 we find a necesary condition for accurate evaluability over either field, and in 
Section [4.21 we prove that this condition is also sufficient for the complex case. 

Throughout this section, we will make use of the following definition of allowability. 

Definition 4.1. Let p be a polynomial over M" or C", with variety V{p) ■.={x : p{x) = 0}. We 
call V{p) allowable if it can be represented as a union of intersections of sets of the form 



IfV{p) is not allowable, we call it unallowable. 

Remark 4.2. For a polynomial p , having an allowable variety V{p) is obviously a Tarski-decidable 
property (following I6'i^ ). since the number of unions of intersections of hyperplanes ©-© is finite. 

4.1 Necessity: real and complex 

All the statements and proofs in this section work equally well for both the real and the complex 
case, and thus we may treat them together. At the end of the section we use the necessity condition 
to obtain a partial result relating to domains. 

Definition 4.3. From now we will refer to the space of variables as S £ {M'",C"}. 

To state and prove the main result of this section, we need to introduce some additional notions 
and notation. 




■J ~ 



0} , 
0} . 



(3) 
(4) 
(5) 
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Definition 4.4. Given a polynomial p over S with unallowable variety V{p), consider all sets W 
that are finite intersections of allowable hyperplanes defined by Q, Q, CLnd subtract from 
V{p) those W for which W C V{p). We call the remaining subset of the variety points in general 
position and denote it by G{p) . 

Remark 4.5. IfV{p) is not allowable, then from definition \4-4\ it follows that G{p) 7^ 0. One 
may also think of points in G{p) as "unallowable" or "problematic", because, as we will see, we 
necessarily get large relative errors in their vicinity. 

Definition 4.6. Given x G S, define the set Allow(x) as the intersection of all allowable hyper- 
planes going through x: 

Allow(x) := (n^ez.^i) n (n^es^.^ij) n (n^eD,, Ai) > 

with the understanding that 

AlIow(2;):=5 whenever x ^ Zi, Sij, Dij for all i,j. 

Note that Allow(j;) is a linear subspace of S. 

We will be interested in the sets Allow(x) primarily when x £ G{p). For such cases we make 
the following observation. 

Remark 4.7. For each x G G{p), the set Allow(x) is not a subset ofV{p): 

Allow(x) 2 V{p), 

which follows directly from the definition ofG{p). 

We can now state the main result of this section, which is a necessity condition for the evalua- 
bility of polynomials over domains. 

Theorem 4.8. Letp be a polynomial over a domain D £ S. Let G{p) be the set of points in general 
position on the variety V{p). If there exists x £ V n G{p) such that Allow(x) n Int(P) / 0, then p 
is not accurately evaluable on T>. 

To prove Theorem 14.81 we need to recall the notion of Zariski topology (see, e.g., j36j^. 

Definition 4.9. A subset Y C M" (or C^) is called a Zariski closed set if there a subset T of the 
polynomial ring M[xi, . . . , Xn] (or C[xi, . . . , Xn]) such that Y is the variety ofT: Y = V{T) := 
npgr^(p)- A complement of a Zariski closed set is said to be Zariski open. The class of Zariski 
open sets defines the Zariski topology on S. 

In this paper, we consider the Zariski topology not on 5, but on a hypercube centered at the 
origin in (5-space (the space in which the vector of error variables 5 lies). This topology is defined 
in exactly the same fashion. 

Note that a Zariski closed set has measure zero unless it is defined by the zero polynomial only; 
then the set is the whole space. In the coming proof we will deal with nonempty Zariski open sets, 
which are all of full measure. Finally, it is worth noting that the Zariski sets we will work with are 
algorithm-dependent . 
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Finally, we represent any algorithm as in jlll ^ by a directed acyclic graph (DAG) with input 
nodes, branching nodes, and output nodes. For simplicity in dealing with negation (given that 
negation is an exact operation), we define a special type of edge which indicates that the value 
carried along the edge is negated. We call these special edges dotted, to distinguish them from the 
regular solid ones. 

Every computational node has two inputs (which may both come from a single other compu- 
tational node) ; depending on the source of these inputs we have computational nodes with inputs 
from two distinct nodes and computational nodes with inputs from the same node. The latter type 
correspond either to 

1. doubling {{x,x) A 2x), 

2. doubling and negating ((— x, —x) A —2x), 

3. computing zero exactly {{—x,x) ^i 0, {—x, —x) i— > 0, or {x,x) i— > 0), 

4. squaring ((x, x) i— > or {—x, —x) i— > x^), 

5. squaring and negating {{—x,x) i— > — x^). 

All nodes are labeled by {op{-),6i) with op(-) representing the operation that takes place at that 
node. It means that at each node, the algorithm takes in two inputs, executes the operation, and 
multiplies the result by (1 + 5j). 

Finally, for each branch, there is a single destination node, with one input and no output, whose 
input value is the result of the algorithm. 

Throughout the rest of this section, unless specified, we consider only non-branching algorithms. 

Definition 4.10. For a given x £ S, we say that a computational node N is of non-trivial type if 
its output is a nonzero polynomial in the variables 6 when the algorithm is run on the given x and 
with symbolic 6s. 

Definition 4.11. For a fixed x, let N be any non-trivial computational node in an algorithm. We 
denote by L{N) (resp., R{N)) the set of computational nodes in the left (resp., right) subgraphs of 
N . If both inputs come from the same node, i.e. L{N) and R{N) overlap, we will only talk about 
L{N). 

Definition 4.12. For a given e > 0, we denote by the hypercube of edge length 2e centered at 
the origin, in 6-space. 

We will need the following Proposition. 

Proposition 4.13. Given any algorithm, any e > 0, and a point x G G{p), there exists a Zariski 
open set A in such that no non-trivial computational node has a zero output on the input x for 
all 5 e A. 

Proof. The proof follows from the definition of the non-trivial computational node. 

Since every non-trivial computational node outputs a non-trivial polynomial in 5, it follows that 
each non-trivial computational node is nonzero on a Zariski open set (corresponding to the output 
polynomial in 5) in H^. Intersecting this finite number of Zariski open sets we obtain a Zariski 
open set which we denote by A; for any 6 £ A the output of any non-trivial computational node is 
nonzero. □ 
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We can now state and prove the following crucial lemma. 

Lemma 4.14. For a given algorithm, any x G G{p), and e > 0, exactly one of the following holds: 

1. there exists a Zariski open set A C ^ such that the value pcomp{x) computed by the algorithm 
is not zero when the algorithm is run with source input x and 5 E A; 

2- Pcompiy-, (^) = for all y £ Allow (x) and all 6 in H^. 

Proof of Lemma \4-14\ We recall that the algorithm can be represented as a DAG, as described in 
the paragraphs preceding Definition I4.1fll 

Fix a point x G G{p). Once x is fixed, the result of each computation is a polynomial expression 
in the 5s. Consider the Zariski open set A whose existence is guaranteed by ProDosition l4.13l There 
are now two possibilities: either the output node is of non-trivial type, in which case Pcomp{x, (^) 7^ 
for all (5 G A, or the output node is not of non-trivial type, in which case Pcomp{x, 60) = for some 
60 G A. 

In the latter case the output of the computation is zero; we trace back this zero to its origin, 
by marking in descending order all computational nodes that produced a zero (and thus we get a 
set of paths in the DAG, all of whose nodes produced exact zeros). Note that we are not interested 
in all nodes that produced a 0; only those which are on paths of zeros to the output node. 

We will examine the last occurrences of zeros on paths of marked vertices, i.e. the zeros that 
are farthest from the output on such paths. 

Lemma 4.15. The last zero on such a path must be either 

1. a source; 

2. the output of a node where {—x,x) A 0, (— x, —x) ^ 0, or 1-^ are performed; 

3. the output of an addition or subtraction node with two nonzero source inputs. 

Proof of Lemma \4-i5\ Note that a nonzero non-source output will be a non-constant polynomial 
in the 6 specific to that node. 

Clearly the last zero output cannot happen at a multiplication node; we have thus to show that 
the last occurrence of a zero output cannot happen at an addition or subtraction node which has 
two nonzero inputs from different nodes, at least one of which is a non-source. We prove the last 
statement by reductio ad absurdum. 

Assume we could have a zero output at a node N with two nonzero inputs, at least one of which 
is not a source. Let R{N) and L{N) be as in Definition KTT\ Let S{L{N)) and S{R{N)) be the 
sets of errors 6i corresponding to the left, respectively the right subtrees of N. 

By assumption, 6{R{N)) U 6{L{N)) ^ (since at least one of the two input nodes is a non- 
source). Let 61 {6r) denote the 5 associated to the left (right) input node of N. Then we claim that 
either 61 ^ 6{R{N)) or 6r ^ 5{L{N)). (There is also the possibility that one of the two input nodes 
is a source and does not have a 5, but in that case the argument in the next paragraph becomes 
trivial.) 

Indeed, since each 6 is specific to a node, if 5i were in 6{R{N)), there would be a path from the 
left input node to the right input node. Similarly, if 6r were in 5{L{N)), then there would be a path 
from the right input node of N to the left input node of N. So if both events were to happen at 
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the same time, there would be a cycle in the DAG. This cannot happen, hence either 6i ^ 6{R{N)) 
or 6r ^ S{L{N)). 

Assume w.l.o.g. 6i ^ 5{R{N)). Then the left input of iV is a non-trivial polynomial in 6i, while 
the right input does not depend on 6i at all. Hence their sum or difference is still a non-trivial 
polynomials in 5/. Contradiction. □ 

Now that Lemma [4.15l has been proven, we can state the crucial fact of the proof of Lemma [4.14l 
all last occurrences of a zero appear at nodes which either correspond to allowable constraints (i.e., 
zero sources, or sums and differences of sources), or are addition/subtraction nodes with both inputs 
from the same node, which always, on any source inputs, produce a zero. 

Take now any point y G Allow (x); then y produces the same chains of consecutive zeros con- 
structed (marked) in Lemma 14.141 as x does, with errors given by 6q G A. Indeed, any node on 
such a chain that has a zero output at x when the error variables are 5o can trace this zero back 
to an allowable constraint (which is satisfied by both x and y) or to an addition/subtraction node 
with both inputs from the same node; hence the node will also have a zero output at y with errors 
5q. In particular, Pcompix , Sq) = for 6o £ A, then PcompiVi^o) = 0. Moreover, changing 5o can 
only introduce additional zeros, but cannot eliminate zeros on the zero paths that we traced for x 
(by the choice of A). Therefore, Pcompiy,^) = for all y £ Allow(x) and 5 £ H^. This completes 
the proof of Lemma 14.141 □ 

From Lemma 14.141 we obtain the following corollary. 

Corollary 4.16. For any e > and any x £ G{p), exactly one of the following holds: the relative 
error of computation, \pcomp ~ pI/IpI; is either infinity at x for all 5 in a Zariski open set or 1 at 
all points y £ (Allow(a:;) \ V{p)) and all 5 £ H^. 

We now consider algorithms with or without branches. 

Theorem 4.17. Given a (branching or non-branching) algorithm with output function Pcompi'), 
X £ G{p), and e > 0, then one of the following is true: 

1. there exists a set Ai of positive measure in such that pcompi^o , 5) is nonzero whenever the 
algorithm is run with errors d £ Ai, or 

2. there exists a set A2 of positive measure in Hi, such that for every 5 £ A2, there exists a 
neighborhood Ns{x) of x such that for every y £ Ns{x) R (Allow(a;) \ V{p)), Pcomp{y,S) = 
when the algorithm is run with errors 6. 

Remark 4.18. This implies that, on a set of positive measure in H^, the relative accuracy of any 
given algorithm is either 00 or 1. 

Proof. With Pcompi') the output function and x a fixed point in general position, we keep the 6s 
symbolic. Depending on the results of the comparisons, the algorithm splits into a finite number of 
non-branching algorithms, which all start in the same way (with the input nodes) and then differ 
in accordance with a finite set of polynomial constraints on the ds and xs. 

Some of these branches will be chosen by sets of 6s of measure zero; at least one of the branches 
will have to be chosen by a set of 5s of positive measure whose interior is nonempty (all constraints 
being polynomials). Call that branch B, and let the set of 6s that choose it be called A^. 

By Proposition I4.13( there exists a Zariski open set A £ such that, for all 6 £ A, no 
non-trivial node in the subgraph representing our branch B has a zero output. In particular, this 
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includes all quantities computed for comparisons that define B. Let A2 :=lnt{AB n A), where Int 
denotes the interior of a set. By the choice of and A, the obtained set A2 is non-empty. 

Suppose the algorithm is run with errors 60 S A2 and Pcompix,6o) 7^ 0. Then, by continuity, 
there must be a neighborhood Ai in the set A2 on which the computation will still be directed to 
branch B and Pcom.p{x, ■) will still be nonzero, so we are in Case 1. 

Assume now that we are not in Case 1, i.e. there is no 5 G A2 such that Pcomp{x-,5) 7^ 0. In 
this case we show by contradiction that Pcompiv, S) = for all y £ Allow(x) if y is sufficiently close 
to X (since Allow(x) is a linear subspace containing x, there exist points in Allow(x) which are 
arbitrarily close to x), thus, that Case 2 must be fulfilled. 

If this claim is not true, then there is no neighborhood Ns{x) of x such that when y € Ns{x) n 
Allow(x), the algorithm is directed to branch B on 6. In that case, there must be a sequence 
{Un} £ (Allow(x) \ V(j>)) such that y-n ^ x and yn is always directed elsewhere for this choice of 6. 
The reason for this is that Allow(x) is a linear subspace which is not contained in V{p); hence no 
neighborhood of x in Allow(a;) can be contained in V{p), and then such a sequence yn must exist. 

Since there is a finite number of branches, we might as well assume that all y„ will be directed 
to the same branch B' for this 5 and that they split off at the same branching node (pigeonhole 
principle). 

Now consider the branching node where the splitting occurs, and let r{z, 5) be the quantity to 
be compared to at that node. Since we always go to B' with yn but to B with x, it follows that 
we necessarily must have r{yn,S) 7^ whereas r{x,6) = 0. On the other hand, until that splitting 
point the algorithm followed the same path with y.„ and with x, computing with the same errors 
6. Applying then case 2 of Lemma 14.141 (which can be read to state that any algorithm computing 
r, and obtaining r{x,S) = 0, will also obtain r{yn,5) = 0), we get a contradiction. 

This completes the proof of Theorem 14.171 □ 

Corollary 4.19. Letp be a polynomial overS with unallowable variety V{p). Choose any algorithm 
with output function Pcomp{'), o-ny point x G G{p), e > 0, and rj <1. Then there exists a set A^; of 
positive measure arbitrarily close to x and a set A of positive measure in H^, such that \pcomp—p\/\p\ 
is strictly larger than rj when computed at a point y £ using any vector of relative errors 5 £ A. 

Proof. On symbolic input x and with symbolic 6, the algorithm will have m branches Bi, . . . , Bm 
that correspond to constraints yielding (semi-algebraic) sets of positive measure *S*i, . . . , Sn% in 
(x, (5)-space. Choose x £ G{p), and let {x,0) be a point in (x, (5)-space. 

1. If (x, 0) is in Int(<S'i) (the interior of some region Si), then by Lemma 14.141 and Corollarv 14.161 
there exists either 

(a) a 5o in (5-space sufficiently small such that (x, Sq) is in Int(S'i) and Pcomp{x, Sq) 7^ 0. The 
relative error at (x,(5o) is in this case 00, and (by continuity) there must be a small ball 
around (x, 60) which is still in Int(5i), on which the minimum relative error is arbitrarily 
large, certainly larger than 1; 

(b) a Jo in (5-space sufficiently small and a y £ Allow(x) \ V{p) sufficiently close to x such 
that {y,So) is in Int(5'i) and Pcomp{y,So) = 0. In this case the relative error at {y,6o) is 
1, and (by continuity) there must be a small ball around (?/,5o) which is still in Int(S'j), 
on which the relative error is strictly larger than our r] < 1. 
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2. Otherwise, {x,0) must be on the boundary of some of the regions Sf, assume w.l.o.g. that 
it is on the boundary of the regions Si,. . . ,Si. In this case, we choose a small hyperdisk 
Bi{{x, 0)) in the linear subspace (x, •) such that B^{{x, 0)) intersects the closures of Si, . . . , 5/ 
(and no other Sis). We can do this because the sets Si are all semi-algebraic. 

(a) If there exists a 6o in (5-space such that {x, 6o) £ -Bg((x, 0)) and (x, 6o) G Int(S'j) for some 
i G {1, ...,/}, then by the same argument as in case ^ we obtain a small ball included 
in Int(5'j) on which the relative error is greater than rj; 

(b) Otherwise, if there exists a Sq such that {x, 6o) G B^{{x, 0)) is on the boundary of some 
region Si for which the local algorithm corresponding to it would yield Pcompix, 6o) ^ 0, 
then (by continuity) there exists a small ball around {x,5q) such that the intersection of 
that small ball with Si is of positive measure, and the relative error on that small ball 
as computed by the algorithm corresponding to Si is greater than 1; 

(c) Finally, otherwise, choose some point {x, 6i) £ B^{{x, 0)), so that (x, 6i) is on the bound- 
ary of a subset of regions S C {Si, . . . , Si}. We must have that Pcompix, 6i) = when 
computed using any of the algorithms that correspond to any Si G S. 

Let now B(x) be a small ball around x in x-space, and consider B{x) = i?(x)n (Allow (x)\ 
V{p)). 

There exists some y G B{x), close enough to x, such that (y, 6i) is either in the interior 
or on the boundary of some G 5. 

By Lemma 14.141 since we must necessarily have Pcompiv^^i) = as computed by the 
algorithm corresponding to Sk, if follows (by continuity) that there is a small ball around 
{y,5i) on which the relative error, when computed using the algorithm corresponding 
to Sk, is greater than rj. The intersection of that small ball with 5^ must have positive 
measure. 

Prom the above analysis, it follows that there is always a set of positive measure, arbitrarily 
close to (x, 0), on which the algorithm will produce a relative error larger than rj. □ 

Proof of Theorem \4.8\ Follows immediately from Theorem 14 . 1 71 and Corollarv 14.191 □ 

Remark 4.20. Consider the polynomial p{x, y) = {I — xy)^ + x^, whose variety is at infinity. We 
believe that Theorem \4.8[ can be extended to show that polynomials like p{x,y) cannot be evaluated 
accurately on M; this is future work. 

4.2 Sufficiency: the complex case 

Suppose we now restrict input values to be complex numbers and use the same algorithm types and 
the notion of accurate evaluability from the previous sections. By Theorem 14.81 for a polynomial p 
of n complex variables to be accurately evaluable over C" it is necessary that its variety V{p) :={z G 
C" : p{z) = 0} be allowable. 

The goal of this section is that this condition is also sufficient, as stated in the following theorem. 

Theorem 4.21. Let p : C" ^ C be a polynomial with integer coefficients and zero constant term. 
Then p is accurately evaluable onD = C"" if and only if the variety V{p) is allowable. 
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To prove this we first investigate what allowable complex varieties can look like. We start by 
recalling a basic fact about complex polynomial varieties, which can for example be deduced from 
Theorem 3.7.4 in |63l page 53]. Let V denote any complex variety. To say that dimc(l^) = k means 
that, for each z £ V and each 5 > 0, there exists w G VriB{z, 6) such that w has a T^-neighborhood 
that is homeomorphic to a real 2/c-dimensional ball. 

Theorem 4.22. Let p be a non-constant polynomial over . Then 

dimc(y(p)) = n - 1. 

Corollary 4.23. Let p : C" — > C 6e a nonconstant polynomial whose variety V{p) is allowable. 
Then V{p) is a union of allowable hyperplanes. 

Proof. Suppose V{p) = ^jSj, where each Sj is an intersection of the sets in Definition 14.11 and, 
for some jo, Sj^ is not a hyperplane but an irredundant intersection of hyperplanes. Let z € 
Sjf, \ Uj^jgSj. Then, for some 6 > 0, B{z,6) n V{p) C Sj^. Since dimc{Sj^^) < n — 1, no point in 
B{z, 6) n V{p) has a y(p)-neighborhood that is homeomorphic to a real 2(n — l)-dimensional ball. 
Contradiction. □ 

Corollary 4.24. Ifp : C" — > C is a polynomial whose variety V{p) is allowable, then it is a product 
p = cYljPj, where each pj is a power of Xi, {xi — xj), or {xi + Xj). 

Proof. By Corollarv l4.23[ the variety V{p) is a union of allowable hyperplanes. Choose a hyperplane 
H in that union, li H = Zj^ for some Jq, expand p into a Taylor series in xj^. If H = Di^j^ (or 
H = Si^j^) for some io, jo, expand p into a Taylor series in (xjq — Xjg) (or (x^p + Xj^)). In either 
case, in this expansion, the zeroth coefficient of p must be the zero polynomial in xj, j ^ jo (or 
j ^ {^0)jo})- Hence there is a A; such that p(x) = Xj^^p{x) in the first case, or p{x) = (xi^^Xj^j)'' p{x) 
in the second (third) one. In any case, we choose k maximal, so that the variety V{p} is the closure 
of the set V{p) \ Zj^ in the first case, or V{p) \ Di^j^ (y{p) \ S'iojo) the second (third) case. Then 
proceed by factoring p in the same fashion. □ 

Proof of Theorem \4.21\ By Corollarv 14.241 p = cflj Pj, with each pj a power of Xfc or (x^ ±xi). It 
also follows that c must be an integer since all coefiicients of p are integers. 

Since each of the factors is accurately evaluable, and we can get any integer constant c in front of 
p by repeated addition (followed, if need be, by negation), which are again accurate operations, the 
algorithm that forms their product and then adds/negates to obtain c evaluates p accurately. □ 

Remark 4.25. From Theorem \4.21l it follows that only homogeneous polynomials are accurately 
evaluable over C". 

4.3 Toward a necessary and sufficient condition in the real case 

In this section we show that accurate evaluability of a polynomial over M" is ultimately related to 
accurate evaluability of its "dominant terms" . This latter notion is formally defined later in this 
section. Informally, it describes the terms of the polynomial that dominate the remaining terms 
in a particular semialgebraic set close to a particular component of its variety; thus it depends on 
how we "approach" the variety of a polynomial. 

For reasons outlined in Section |21 we consider here only homogeneous polynomials. Futher- 
more, most of this section is devoted to non-branching algorithms, but we do need branching for 
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our statements at the end of the section. The reader will be alerted to any change in our basic 
assumptions. 

Here is a short walk through this section: 



• In Section f4. 3. IL Homogeneity, we discuss an expansion of the relative error \pcompix,S) — 
p{x)\/\p{x)\ as a function of x and 6, and prove a result about accurate evaluability of homo- 
geneous polynomials that will be used in Section [4. 3. 3L Pruning. 

• In Section I4.3.2L Dominance, we introduce the notion of dominance and present different 
ways of looking at an irreducible component of the variety V{p) using various simple linear 
changes of variables. These changes of variables allow us to identify all the dominant terms 
of the polynomial, together with the "slices" of space where they dominate. 

• In Section 14.3.31 Pruning, we explain how to "prune" an algorithm to manufacture an 
algorithm that evaluates one of its dominant terms, and prove a necessary condition for the 
accurate evaluation of a homogeneous polynomial by a non-branching algorithm. Roughly 
speaking, this condition says that accurate evaluation of the dominant terms we identified in 
Section f4. 3. 2L Dominance, is necessary. 

• In Section [4.3.41 Sufficiency of evaluating dominant terms, we identify a special collec- 
tion of dominant terms, together with the slices of space where they dominate. If accurately 
evaluable by (branching or non-branching) algorithms, these dominant terms allow us to 
construct a branching algorithm for the evaluation of the polynomial over the entire space 
(Theorem 14. 46|) . These are just some of the terms present in the statement of Theorem 14.441 

4.3.1 Homogeneity 

We begin by establishing some basic facts about non-branching algorithms that evaluate homoge- 
neous polynomials. 

Definition 4.26. We call an algorithm Pcompix, 5) with error set 6 for computing p{x) homogeneous 
of degree d if 

1. the final output is of degree d in x; 

2. no output of a computational node exceeds degree d in x; 

3. the output of every computational node is homogeneous in x. 

Lemma 4.27. If p{x) is a homogeneous polynomial of degree d and if a non-branching algorithm 
evaluates p{x) accurately by computing Pcompix,S), the algorithm must itself be homogeneous of 
degree d. 

Proof. First note that the output of the algorithm must be of degree at least d in x, since 
Pcomp{x, S) = p{x) when 5 = 0. Let us now write the overall relative error as 

, . .^ Pcompix, 6) -p{x) sr^Paix) a 

relerr[x, 6) = = > — -— 6 

P{x) ^ p{x) 

where a is a multi-index, li pcomp{x,6) is accurate then Pa{x)/p{x) must be a bounded rational 
function on the domain (M", or in the homogeneous case the sphere S^^~^^). This implies, in 
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particular, that the output cannot be of degree higher than d in x. So, Condition 1 of Definition l4.251 
must be satisfied. 

Now suppose Condition |21 of Definition 14.261 is violated. We would like to show that the final 
output is also inhomogeneous. We can assume without loss of generality that the algorithm does 
not contain nodes that do operations like x — x ot ■ x (these can be "pre-pruned" and replaced 
with a source). There exists a highest node {op{-),5i) whose output is not homogeneous. If it is 
the output node, we are done. Otherwise, look at the next node {op{-),6j) on the path toward the 
output node. The output of op{-,Sj) is homogeneous. On the other hand, the output of {op{-),Si) 
(which is one of the two inputs to (op(-), 6j)) must be inhomogeneous in x and must contain a term 
5ir{x) with r(x) an inhomogeneous polynomial in x. 

If op{-,5i) is the only input to op{-,6j), then inhomogeneity will be present in both outputs, 
since neither doubling nor squaring can cancel it; contradiction. Otherwise there is another input 
to op{-, 6j) (call it op{-, 6k))- The output of op{-, 6k) must therefore also be inhomogeneous to cancel 
the inhomogeneous r{x). Since the DAG is acyclic, 6i is not present in the output of op{-,6k) or 
6k is not present in the output op{-,6i). Without loss of generality, assume the former case. Then 
the term 6ir{x) will create inhomogeneity in the output of {op{-),6j), and hence {op{-),6i) is not a 
highest node with inhomogeneous output, contradiction. Hence Pcomp{x,6) is not homogeneous in 
X, thus one of the Pa{x)'s has to contain terms in x of higher or smaller degree than d. 

Similarly, if Condition l2l of Definition 14. 261 were violated, then for some 6s the final output would 
be a polynomial of higher degree in x, and that would also mean some Paix) would be of higher 
degree in x. 

In either of these cases, if some Pa{x) contained terms of smaller degree than d, by scaling the 
variables appropriately and letting some of them go to 0, we would deduce that pa{x) /p{x) could 
not be bounded. If some Pa{x) contained terms of higher degree than d, by scaling the variables 
appropriately and letting some of them go to oo, we would once again obtain that pa{x) / p{x) could 
not be bounded. □ 

This proof shows that an algorithm evaluates a homogeneous polynomial p accurately on M" if 
and only if each fraction Pa/p is bounded on M". It also shows each pa has to be homogeneous of 
the same degree as p. Therefore, each fraction Pa/p is bounded on M" if and only if it is bounded 
on the unit sphere S^"'~^h We record this as a corollary. 

Corollary 4.28. A non-branching homogeneous algorithm is accurate on M" if and only if it is 
accurate on 5*^""^^ 

4.3.2 Dominance 

Now we begin our description of "dominant terms" of a polynomial. Given a polynomial p with 
an allowable variety V{p), let us fix an irreducible component of V{p). Any such component is 
described by linear allowable constraints, which, after reordering variables, can be grouped into / 
groups as 

Xl = • • • = Xk^ = 0, Xki + l = • • • = ibXfcj, • • • , Xki_-^J^l = • • • = ibXfcj. 

To consider terms of p that "dominate" in a neighborhood of that component, we will change 
variables to map any component of a variety to a set of the form 

Xl = ■ ■ ■ = Xki = 0, = • • • = Xfc2 = 0, . . . , Xfc,_^+2 = ■ ■ ■ = Xki =0. (6) 
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The changes of variables we will use are defined inductively as follows. 

Definition 4.29. We call a change of variables associated with a set of the form 

dlXl = 0-2X2 = ••• = CTfcXfc, ai=±l, l = l,...,k, 

basic if it leaves one of the variables unchanged, which we will refer to as the representative of the 
group, and replaces the remaining variables by their sums (or differences) with the representative 
of the group. In other words, 

Xj := Xj, xi := xi — ajaixj for I ^ j, 

where Xj is the representative of the group xi, . . . , Xk- A change of variables associated with a set 
of all X satisfying conditions 

xi = • • • = Xfei = 0, 

(^ki+lXki+l = Crki+2Xki+2 = • • • = CTk^Xk^, 



(Tj = ±1 for all pertinent j 

is basic if it is a composition of the identity map on the first ki variables and (/ — I) basic changes of 
variables associated with each set ak^+iXk^+i = ■ ■ ■ = cTfcjiCfcj through (Tfc;_-^+iXfc,_^+i = • • • = ak^Xki- 
Finally, a change of variables associated with a set S of type is standard if it is a basic 
change of variables associated with some allowable irreducible superset S ^ S and it maps S to 

Thus, a standard change of variables amounts to splitting the group xi, . . . ,Xk^ into smaller 
groups and either keeping the conditions x,. = • • • = Xg = or assigning arbitrary signs to members 
of each group so as to obtain a set It may also involve splitting the chains 

of conditions ak^+iXk^+i = ••• = CTk^^-^Xk^^-^ into several subchains. The standard change of 
variables is then just one of the basic changes of variables associated with the obtained set. 

Example 4.30. There are 5x3 standard changes of variables associated with the set 













Xl = X2 




0, 


X3 = 


— X4 = 


= X5 : 










Xl 




Xl, 


X2 




X2, 


X3 




X3, 


X4 = 


X4 + X3, 


X5 




X5 - X3, 


or 


Xl 




Xl, 


X2 




X2, 


X3 




X3 + X4, 


X4 = 


X4, 


X5 




X5 + X4, 


or 


Xl 




Xl, 


X2 




X2, 


X3 




X3 - X5, 


X4 = 


X4 + X5, 


X5 




X5, 


or 


Xl 




Xl, 


X2 




X2 - Xl, 


X3 




X3, 


X4 = 


X4 + X3, 


X5 




X5 - X3, 


or 


Xl 




Xl, 


X2 




X2 - Xl, 


X3 




X3 + X4, 


X4 = 


X4, 


X5 




X5 + X4, 


or 


Xl 




Xl, 


X2 




X2 - Xl, 


X3 




X3 - X5, 


X4 = 


X4 + X5, 


X5 




X5, 


or 


Xl 




Xl, 


X2 




X2 + Xl, 


X3 




X3, 


X4 = 


X4 + X3, 


X5 




X5 - X3, 


or 


Xl 




Xl, 


X2 




X2+X1, 


X3 




X3 + X4, 


X4 = 


X4, 


X5 




X5 + X4, 


or 


Xl 




Xl, 


X2 




X2+X1, 


X3 




X3 - X5, 


X4 = 


X4 + X5, 


X5 




X5, 


or 


Xl 




Xl - X2, 


X2 




X2, 


X3 




X3, 


X4 = 


X4 + X3, 


X5 




X5 - X3, 


or 


Xl 




Xl - X2, 


X2 




X2, 


X3 




X3 + X4, 


X4 = 


X4, 


X5 




X5 + X4, 


or 


Xl 




Xl - X2, 


X2 




X2, 


X3 




X3 - X5, 


X4 = 


X4 + X5, 


X5 




X5, 


or 


Xl 




Xl + X2, 


X2 




X2, 


X3 




X3, 


X4 = 


X4 + X3, 


X5 




X5 - X3, 


or 


Xl 




Xl + X2, 


X2 




X2, 


X3 




X3 + X4, 


X4 = 


X4, 


X5 




X5 + X4, 


or 


Xl 




Xl + X2, 


X2 




X2, 


X3 




X3 - X5, 


X4 = 


X4 + X5, 


X5 




X5, 





26 



The supresets S for this example are the set S itself together with the set {x : xi + X2 = 0, x^ = 
—X4 = X5} and the set {x : xi — X2 = 0, X3 = — X4 = X5}. 

Note that we can write the vector of new variables x as Cx where C is a matrix, so can label 
the change of variables by the matrix C. 

Now let us consider components of the variety V{p)- We have seen that any given component 
of V{p) can be put into the form xi = X2 = ... = Xk = using a standard change of variables, 
provided V{p) is allowable. (To avoid cumbersome notation, we renumber all the variables set to 
zero as xi through x^ for our discussion that follows. We will return to the original description to 
introduce the notion of pruning.) 

Write the polynomial p{x) in the form 

aga 

where, almost following MATLAB notation, we write xji-^] :=(xi, . . . , x^), x^^^i-n] ■=ixk+i, ■ ■ ■ , a^n). 
Also, we let A be the set of all multi-indices A :=(Ai, . . . , A^) occuring in the monomials of p{x). 

To determine all dominant terms associated with the component xi = X2 = ■■■ = Xk = 0, 
consider the Newton polytope P of the polynomial p with respect to the variables xi through x^ 
only, i.e., the convex hull of the exponent vectors A G A (see, e.g., |46| p. 71]). Next, consider the 
normal fan N{P) of P (see |681 pp. 192-193]) consisting of the cones of all row vectors ij from the 
dual space (R'^)* whose dot products with x G P are maximal for x on a fixed face of P. That 
means that for every nonempty face F of P we take 

k 

Np :={r] = {rii, . . . , n^) S (M'^)* : F Q {x €z P : rix{:=y^ ^j^j) — max?7y}} 

and 

N{P) ■.={Nf : F is a face of P}. 

Finally, consider the intersection of the negative of the normal fan —N{P) and the nonnegative 
quadrant (M'^)^. This splits the first quadrant (M*^)^ into several regions Saj according to which 
subsets Aj of exponents A "dominate" close to the considered component of the variety V{p), in 
the following sense: 

Definition 4.31. Let Aj be a subset of A that determines a face of the Newton polytope P of p such 
that the negative of its normal cone —N{P) intersects [Mf')'^ nontrivially (not only at the origin). 
Define Sa- G (M'^)^ to be the set of all nonnegative row vectors i] such that 

rjXi = rj\2 < r]X, VAi, A2 G Aj, and A G A \ Aj. 

Note that if xi through xj. are small, then the exponential change of variables Xj 1— > — log |xj| 
gives rise to a correspondence between the nonnegative part of —N{P) and the space of original 
variables xj^.^]. We map back the sets Saj into a neighborhood of in M'^ by lifting.^ 

Definition 4.32. Let F^^ C [—1, 1]'^ be the set of all points x^i-^ G M*^ such that 

7?: = (-log|xi|,...,-log|xfc|) G Sa^. 

^This is reminiscent of the concept of an amoeba introduced in 1331 . 
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Remark 4.33. For any j, the closure of F\. contains the origin iriR^. 

Remark 4.34. Given a point x^i.j.-^ S -^A^; o,nd given rj = (ni, n2, . . . , n^) E S/\_j, for any t S 
(0,1), the vector (xit"-^ Xkt^'') is in Fp^^ Indeed, i/ (— log . . . , — log |xfc|) G 5a^, then so 
is (— log l^il, . . . , — log — log|t|r/, since all equalities and inequalities that define S/\_. will be 
preserved, the latter because log \ t\ < 0. 

Example 4.35. Consider the following polynomial 

p{xi, X2, X3) = X2X^'^ + XiX2x\^ + xfx^'^ + xfx2^ + xJ''a;2X3. 

We show below the Newton polytope P of p with respect to the variables xi, X2, its normal fan 
N{P), the intersection —N{P) n R^, the regions Saj, and the regions F\.. 




Figure 1. A Newton polytope P. Figure 2. Its normal fan N{P). 




Figures. The intersection —A^(P) n M^. Figure 4. The regions • 
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Figure 5. The regions F\ .. 



Definition 4.36. We define the dominant term of p{x) corresponding to the component xi 
Xk = and the region Fa by 



Pdom,{x):= ^ Cxx\^.^qx{x^k+l:n] 



AeA, 



The following observations about dominant terms are immediate. 
Lemma 4.37. Let rj = (ni, . . . , n^) G S\. and let dj := "^XieA ^i^i- fixed and let 



x{t): = {xi{t),...,Xn{t)), Xj{t):=\ q ^ 



, . . . , 



X- 



j = k + l,...,n. 

Then pdomj{x{t)) has degree dj in t and is the lowest degree term of p{x{t)) in t, that is 

p{x{t)) = pdorrij ix{t)) + o{t'^J) as t ^ 0, deg^pdomj ix{t)) = dj. 
Proof. Follows directly from the definition of a dominant term. 

Corollary 4.38. Under the assumptions of Lemma \4.'^T\ suppose that pdomj{x^) / 0. Then 

PdomAx{t)) 



lim , , 
t^o p{x{t)) 



1. 



□ 
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Thus Pdoruj is the leading term along each curve traced by x{t) as t tends to zero from above. 
An important question now is whether the dominant term pdom ■ indeed dominates the remaining 
terms of p in the region F\. in the sense that Pdomj{x) /p{x) is close to 1 sufficiently close to 
the component xi = ■ ■ • = Xfc = of the variety V{p). This requires, at a minimum, that the 
variety V{pdomj) does not have a component strictly larger than the set xi = • • • = = 0. Note 
that most dominant terms of a polynomial actually fail this requirement. Indeed, most dominant 
terms of p are monomials, which correspond to regions F\. indexed by singletons Aj, hence to the 
vertices of the Newton polytope of p. The dominant terms corresponding to larger sets Aj are more 
useful, since they pick up terms relevant not only in the region F^^ . but also in its neighborhood. In 
Example 14 . 351 above the dominant terms for ^^{(2.2), (8,0)} ™d -^{(2, 2), (0,8)}) corresponding to the edges 
of the Newton polygon, are the useful ones. This points to the fact that we should be ultimately 
interested only in dominant terms corresponding to the facets, i.e., the highest-dimensional faces, 
of the Newton polytope of p. Note that the convex hull of Aj is a facet of the Newton polytope 
if and only if the set S\. is a one-dimensional ray. 

The next lemma will be instrumental for our results in Section 14.3.41 It shows that each 
dominant term Pdomj such that the convex hull of Aj is a facet of the Newton polytope of p and 
whose variety V{pdomj ) does not have a component strictly larger than the set xi = ■ ■ ■ = = Q 
indeed dominates the remaining terms in p in a certain "slice" -Fa^ around Fp^. . 

Lemma 4.39. Let Pdonij be the dominant term of a homogeneous polynomial p corresponding to 
the component xi = ■ ■ ■ = Xk = of the variety V{p) and to the set Aj whose convex hull is a facet 
of the Newton polytope N . 

Let be any closed pointed cone in (M*"')^ with vertex at that does not intersect other one- 
dimensional rays S\^, I 7^ j, and contains S/\_. \ {0} in its interior. Let F\j be the closure of the 
set 

{x[i:k] e [-1,1]^' : (-log|xi|,...,-log|2;fc|) G 5aJ. (9) 

Suppose the variety V{pdomj) of pdomj "is allowable and intersects F\. only at 0. Let \\ ■ \\ be any 
norm. Then, for any 6 = 5{j) > 0, there exists e = e{j) > such that 

ll-^fi-felll ~ 
< 6 whenever ■ < e and xji-^j G F\^. (10) 

lF[fc+l:n] II 

Proof. We prove the lemma in the case F\. does not intersect nontrivially any of the coordinate 
planes (the proof extends to the other case via limiting arguments). Let ||x[fc+i.„]|| = 1 and let xi 
through X]^ be ±1. If 77 = (ni, . . . ,71^) G then, directly from the definition of the set F\., the 
curve (t^^xi, . . . , t"''=3;fc), t G (0, 1], lies in F^. (and every point in F^. lies on such a curve). Denote 
(t"^xi, . . . , t'^'^Xf^, X[fc_|_i.„]) by x(t) and let t decrease from 1 to 0, keeping the Xm, m = 1, . . . , n, 
fixed. By the assumption of the Lemma, Pdomj{x{t)) does not vanish for sufficiently small t > 0. 
Moreover, by Lemma [4.371 Pdom is the leading term of p in F\^. Since the cone S^^ around S\. 
does not intersect any other one-dimensional rays S'a;, I i= all the monomials present in any 
term that dominates in Ff^. \ F\. are already present in Pdomj ■ Thus Pdomj contains all terms 
that dominate in F\j. Therefore, there exists £{x) > such that \pdomj 

{xit))/p{x{t)) -l\ < 6 

whenever t < £{x). The function / : x — > e{x) is lower semicontinuous. Since the set S ■.={x : Xm = 
±1, m = l,...,k, ||x[fc+i.„]|| = 1} is compact, the minimum e: = min/(S') is necessarily positive 
and satisfies H10() . □ 



Pdonij [X[l:k] , ^^[fc+lin] ) 



P{x\l:k]^X\k+l:n]) 
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The above discussion of dominance was based on the transformation of a given irreducible 
component of the variety to the form xi = • • • = = 0. We must reiterate that the identification 
of dominant terms becomes possible only after a suitable change of variables C is used to put a 
given irreducible component into the standard form xi = ■ ■ ■ = Xk = and then the sets Aj are 
determined. Note however that the polynomial Pdomj is given in terms of the original variables, i.e., 
as a sum of monomials in the original variables Xq and sums /differences XgzizXr- We will therefore 
use the more precise notation Pdomj,c i^i the sequel. 

Without loss of generality we can assume that any standard change of variables has the form 

X = {x[i:ki],X[k^+i.,k2]^- ■ ■ ,X[kt_-,+i:ki]) ^x = , X[fc,+i:fc2] , . . • , ) , where 

^km+l-=^km+l, ^k^+2-=Xk^+2 - Crkm+2Xk^+l, X k^^^ := X k^+^ - <T k^+^X k^+^ , (11) 

ko:=0, ar = H for all pertinent r 

Note also that we can think of the vectors r] G S\j as being indexed by integers 1 through ki, i.e., 
r] = {ill, . . . , n^i). Moreover, to define pruning in the next subsection we will assume that 

f^km,+i ^ for all r = km + 2, . . . , km+i and for all m = 0, . . . , ^ — 1. (12) 

Remark 4.40. This condition is trivially satisfied if = 0, as is the case for any group 

^km+i = (^km+2Xk,n+2 = ••• = Xfc^^^^ j of original conditions that define the given irreducible 

component ofV{p), since Xk^+i does not have to he close to in the neighborhood of that component 
ofV{p). If, however, the same group of equalities was created from the original conditions Xk^+i = 
Xkm+2 = • • • = Xk^_f_i = due to the particular change of variables C, the condition Ui!\) is 
no longer forced upon us. Yet can he assumed without loss of generality. Indeed, if, say, 
iT'km+2 < ^fcm+i' then we can always switch to another standard change of variables by taking 
^km+2 to be the representative of the group Xk^+i, ■ ■ ■ ,xi^^^_^^ and taking the sums/differences with 
Xkm+2 <is the other new variables. Also note that il^) is satisfied either by all or by no vectors in 
. In other words, is a property of the entire set S\. . So, with a slight abuse of terminology 
we will say that a set Sa- satisfies or fails hl^). 

Finally note that the curves {x{t)) corresponding to the change of variables (|llj) are described 
as follows: 

x{t) : = (x[i:fc^] (t), a::[fci+i:fc2] (*),•••, x^ki^^+i:ki]{t),x^ki+i:n])^ ^hexe 

^[fem+l:A:m+i] (^) 

(r'=-+^Xfe^+l,t"'=-+2Xfc„+2 + afc„+2t"'=-+^Xfc^+l, . . . + (yk^+,t^''^+^Xk^+l) 

where /co:=l, m = 0, 
This description will be instrumental in our discussion of pruning, which follows immediately. 

4.3.3 Pruning 

Now we discuss how to convert an accurate algorithm that evaluates a polynomial p into an accurate 
algorithm that evaluates a selected dominant term pdomj,c- This process, which we will refer to as 
pruning, will consist of deleting some vertices and edges and redirecting certain other edges in the 
DAG that represents the algorithm. 
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Definition 4.41 (Pruning). Given a non-branching algorithm represented by a DAG for com- 
puting Pcompix,S), a standard change of variables C of the form and a subset Aj G A satis- 
fying hll3\) . we choose any rj G ^a^, we input (formally) the expression Ucl\) . and then perform the 
following process. 

We can perform one of two actions: redirection or deletion. 5?/ redirection (of sources) we mean 
replacing an edge from a source node corresponding to a variable xj to a computational node i by an 
edge from the representative Xrep of xj followed by exact negation if crj = —1. This corresponds to 
replacing Xj by the product ajXrep- To define deletion, consider a node i with distinct input nodes 
j and k. Then deletion of node i from node j means deleting the out-edge to node i from node j , 
changing the origin of all out-edges from node i to input node k, and deleting node i. 

Starting at the sources, we process each node as follows, provided that both its inputs have 
already been processed (this can be done because of acyclicity). Let the node being processed be node 
i, i.e., {op{-),6i), and assume it has input nodes k and I. Both inputs being polynomials in t, we 
determine the lowest degree terms in t present in either of them and denote these degrees by deg(fc) 
and deg(/). 

if op( ) = • and one or both inputs are sources, then 

redirect each source input. 
if op(-) = lb, then 

if deg(A;) / deg(Z), say deg(A;) > deg(Z), delete input node i from node k. 

else If nodes k and I are sources and the operation op{-) leads to cancellation of their lowest 
degree terms in t, examine their second-lowest degree terms. If those degrees coincide or 
if one second-lowest term is missing, we change nothing. If one is bigger than the other, 
we do not change the source containing the lower degree term in t, but redirect the other 
source. 

If only one of nodes k and I is a source or if both inputs are sources but there is no 
cancellation of lowest degree terms, redirect each source. 

We then delete inductively all nodes which no longer are on any path to the output. 

We call this process pruning, and denote the output of the pruned algorithm by pdomj,c,com.p{x, S). 

Remark 4.42. Note that the outcome of pruning does not depend on the choice of t] £ S/\_. . 
Since each region S\. is determined by linear homogeneous equalities and inequalities with integer 
coefficients, the vector rj can always be chosen to have all integer entries. 

Example 4.43. Figure 6 shows an example of pruning an algorithm that evaluates the polynomial 

xjxl + (X2 - X3)^ + {X3 - Xifxl 

using the substitution 

{tXi,X2,tX3 + X2,tX4 + X2, X5) 

near the component 

X\ =0, X2 = X3 = X4. 
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Figure 6. Pruning an algorithm for p{x) = x\x'^ + (x2 — a^s)^ + (2:3 — 2;4)^x| 



The result of pruning is an algorithm that evaluates the dominant term 

22,/ \2 2 

X1X2 + [XS - X4) Xg. 

One of two branches leading to the node A is pruned due to the fact that it computes a quantity of 
order O(t^) whereas the other branch produces a quantity of order 0{t^). 
The output of the original algorithm is given by 

+ Si)xl{l + ,52)(1 + 53) + {X2 - xs)\l + 6^)\l + 65f{^ + Se)) (1 + ^7) 

+ {X3 - X4)'(l + ds)H^ + S9)xl{l + <5lo)(l + Sii)) (1 + 612). 

The output of the pruned algorithm is 

+ 6i)xl{l + 62){l + 63) + (X3 - x^fil + Ssfil + 6^)xl{l + <5io)(l + ^n)) (1 + 612). 
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Let us prove that this process will indeed produce an algorithm that accurately evaluates the 
corresponding dominant term. 

Theorem 4.44. Suppose a non-branching algorithm evaluates a polynomial p accurately on R" 
by computing Pcomp{x,S). Suppose C is a standard change of variables Ul\) associated with an 
irreducible component ofV{p). Letpdom ,c be one of the corresponding dominant terms of p and let 
satisfy Then the pruned algorithm defined in Definition \4 .41\ with output pdomj ,c,com.pix , ^) 

evaluates Pdomj,c accurately on M". In other words, being able to compute all such Pdomj,c for all 
components of the variety V{p) and all standard changes of variables C accurately is a condition 
necessary to compute p accurately. 

Proof. Directly from the definition of pruning it can be seen that the output of each computational 
node is a homogeneous polynomial in t. This can be checked inductively starting from computa- 
tional nodes operating on two sources, using the pruning rules. Moreover, the pruning rules are 
equivalent to taking the lowest degree terms in t (as well as setting some 5s to zero). This can be 
checked inductively as well, once we rule out the situation when a ± node in the original algorithm 
leads to exact cancellation of lowest degree terms of the inputs, and at least one of the inputs is not 
a source. Indeed, in that case one of the inputs contains a factor (1 + 5) and that 6 by acyclicity is 
not present in the other input. Therefore no exact cancellation of lowest degree terms can occur. 

Thus, the final output Pdomj,c,compix,6) of the pruned algorithm takes the lowest degree terms 
in t of the final output of the original algorithm, so Pdom ,c,comp{x, 6) is homogeneous in t (of degree 
dj = rjX = ^^iUi). We write 

Pcomp{x,5) -p{x) _ -y^ Pa{x) Pdomj,C,cx,mp{x) -Pdomj,c{x) _ -y^ Pa,domj,c{x) 

P{x) ^ p{x) ' Pdomj,c{x) ^ Pdomj,c{x) 

Note that, in eliminating nodes and redirecting the out-edges in the pruning process, we do the 
equivalent of setting some of the 5s to 0, and hence the set of monomials 5°^ present in the second 
sum is a subset of the set of monomials 5" present in the first sum. 

Indeed, first of all, we can focus on the effect of deleting some nodes, since redirection does 
not affect 6s at all, because only sources can be redirected. So, if 6" appears in the second sum, 
there is a path which yields the corresponding multi- index, on which some term in Pa,domj,c{x) is 
computed. But since this path survived the pruning, there is a corresponding path in the original 
DAG which perhaps has a few more nodes that have been deleted in the pruning process and a 
few source nodes that were redirected. In the computation, the effect of deleting a node was to set 
that 5 equal to (and make some terms of higher degree disappear). So the surviving term was 
also present in the computation pcomp{x,5), with the same multi-index: just choose the 1 in the 
(1 + 6) each time when you hit a node that will be deleted (i.e., whose 6 will be set to 0). 

Now note that Pa,domj,cix) is the leading term of Paix), i.e., the term of smallest degree dj in 
t. This happens since each term of degree dj in Paix) must survive on the same path in the DAG, 
with the same choices of 1 in (1 + 5) each time we hit a deleted node. 

We can now prove that Pdorrij,c,comp{x) is accurate. To do that, it is enough to show that each 
Pa,domj,c{x)/Pdomj,cix) is bounded, provided that there is some constant M such that \paix)/p{x)\ < 
M for ah X. 

Choose a point x = (xi, . . . , x„) not on the variety of Pdom ,c and consider the curve traced by 
the associated point x{t) from as t tends to 0. Since both Pa,domj,c{x{t)) and Pdamj,c{x{t)) 
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are homogeneous of degree dj in t, we have 



Pa,dom,,cix) 



Pdomi,c{x) 



Pa,domj,c{x{t)) 



Pdomi,c{xit)) 



hm 



Pa,dom,,cix{t)) 



Pdom,,c{x{t)) 



Since both Pa4om-,c{x{t)) and Pdom. ,c{x{t)) are the dominant terms in Pa{x), respectively p{x) 
along the curve {x{t) : t — > 0}, we conclude that 



Pa,domj,c{x) 


= lim 


Pdom^,c{x) 





Pa,do7n,,c{x{t)) 



Pdom,,c{x{t)) 



= lim 


Pa{x{t)) 




Pixit)) 



< M . 



Invoking the density of the Zariski open set {x : Pdom ,c{x) / 0}, we are done. 



□ 



4.3.4 Sufficiency of evaluating dominant terms 

Our next goal is to prove a converse of a sort to Theorem 14.441 Strictly speaking, the results that 
follow do not provide a true converse, since branching is needed to construct an algorithm that 
evaluates a polynomial p accurately from algorithms that evaluate its dominant terms accurately. 

For the rest of this section, we make two assumptions, viz., that our polynomial p is homogeneous 
and irreducible. The latter assumption effectively reduces the problem to that of accurate evaluation 
of a nonnegative polynomial, due to the following lemma. 

Lemma 4.45. If a polynomial p is irreducible and has an allowable variety V{p), then it is either 
a constant multiple of a linear form that defines an allowable hyperplane or it does not change its 
sign in M". 

Proof. Suppose that p changes its sign. Then the sets {x : p{x) > 0} and {x : p{x) < 0} are 
both open, hence the part of the variety V{p) whose neighborhood contains points from both sets 
must have dimension n — 1. The only allowable sets of dimension n — 1 are allowable hyperplanes. 
Therefore, the linear polynomial that defines an allowable hyperplane must divide p. As p is 
assumed to be irreducible, p must be a constant multiple of the linear polynomial. 

Thus, unless p is a constant multiple of a linear factor of the type it must satisfy 

p{x) > for all X e or p{x) < for all x G M". □ 

From now on we therefore restrict ourselves to the nontrivial case when a (homogeneous and 
irreducible) polynomial p is nonnegative everywhere in M". 

Theorem 4.46. Let p be a homogeneous nonnegative polynomial whose variety V{p) is allowable. 
Suppose that all dominant terms pdom ,c for all components of the variety V{p), all standard changes 
of variables C and all subsets Kj satisfying 111^) are accurately evaluable. Then there exists a 
branching algorithm that evaluates p accurately owerM". 

Proof. We first show how to evaluate p accurately in a neighborhood of each irreducible component 
of its variety V{p). We next evaluate p accurately off these neighborhoods of V{p). The final algo- 
rithm will involve branching depending on which region the input belongs to, and the subsequent 
execution of the corresponding subroutine. We fix the relative accuracy rj that we want to achieve. 

Consider a particular irreducible component Vq of the variety V{p). Using any standard change 
of variables C, say, a basic change of variables associated with Vq, we map Vq to a set of the 
form xi = • • • = Xfc = 0. Our goal is to create an e-neighborhood of Vq where we can evaluate 
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p accurately. It will be built up from semialgebraic e-neighborhoods. We begin with any set S 
containing a neighborhood Vq, say, we let S coincide with [— 1, l]'^ x W^~^ after the change of 
variables C. We partition the cube [—1,1]*^ into sets Fa^ of type © as follows: We consider, as in 
S ect ion [4 ■ 3 ■ 2 1 ab ove . the Newton polytope P of p, form the intersection of the negative of its normal 
fan —N{P) with the nonnegative quadrant (in the new coordinate system), and determine 
the sets 5aj • If condition (|12|) fails for some of the sets Sp^. , we transform them using a suitable 
standard change of variables as described in Remark l4.4Ul so as to meet condition ()12() . For the rest 
of the argument, we can assume that all sets ^a, satisfy (fT2|) . 



3 

To form conic neighborhoods of one-dimensional rays oi-N{P)r\M.\ (which are normal to facets 
of the Newton polytope), we intersect —N{P) n with, say, the hyperplane xi + ■ ■ ■ + Xk = 1. 
Perform the Voronoi tesselation (see, e.g., [HHl) of the simplex xi + ■ ■ ■ + Xk = > 0, j = 1, . . . , /c 
relative to the intersection points of — A^(P)nM^ with the hyperplane xi + - ■ ■ + Xk = 1. Connecting 
each Voronoi cell of the tesselation to the origin xi = • • • 5;^ = by straight rays, we obtain cones 
S\. and the corresponding sets F{^. of type Q. Note that the Voronoi cells and therefore the cones 
S\ . are determined by rational inequalities since the tesselation centers have rational coordinates. 
Hence the sets Faj are semialgebraic, and moreover are determined by polynomial inequalities with 
integer coefficients. Indeed, even though the sets F\. are defined using logarithms, the resulting 
inequalities are among powers of absolute values of the variables and/or their sums and differences. 
For example, if a particular set F^^. is described by the requirement that (—log — log \x2\) lie 
between two lines through thw origin, with slopes 1/2 and 2/3, respectively, then this translates 
into the condition |x2|^ < \xi\^ < |x2p. 

Consider a particular "slice" F{^. and the dominant term Pdom.j,c- By Theorem 14.441 the dom- 
inant term Pdom ,c must be accurately evaluable everywhere. Hence, in particular, its variety 
V{pdomj,c) must be allowable. S ince the polynomial p vanishes on Vq , the dominant term pdomj ,c 
must vanish on Vq as well. So, there are two possibilities: V{pdomj,c) H F\. either coincides with 
Vo or is strictly larger. 

In the first case we apply Lemma 14.391 to show that we can evaluate p accurately in F\ . 
sufficiently close to Vq, as follows: Since the polynomial Pdom ,c is accurately evaluable everywhere, 
for any number r]j > there exists an algorithm with output Pdomj,c,comp such that \pdomj,c,comp — 
Pdomj,c\/\Pdomj,c\ < Vj everywhere. Next, by Lemma ESHl for any Sj > there exists £j > such 
that (fTU)) holds. Choose r]j and 6j so that 



Then we have 

Pdoruj ,C,comp P 



P 



< 



r/j(l -I- 6j) + 6j < rj. 

\Pdomj ,C,comp Pdomj ,c\ \Pdomj , c -P\ 
\P\ 



IPdorrij ,C,comp Pdomj,c\ \Pdomj ,c\ \Pdomi,C P\ ^ /-, c \ c- 

= —I 1 TT — + n ^ + "^i <V 

\Pdomj,c\ \p\ \p\ 

in the -neighborhood of Vq within the set F\.. Therefore, p can be evaluated by computing 
Pdomj,c,comp to accuracy r] in the ej-neighborhood of Vq within F\.. 

In the second case V(j>domj,c) has an irreducible component, say, Wq, that is strictly larger than 
Vq and intersects Fa, nontrivially. Since V{pdom ,c) is allowable, it follows from Definition 14.291 
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that there exists a standard change of variables Ci associated with Vq that maps Wq to a set 
xi = ■ ■ ■ = xi = 0, I < k. Use that change of variables and consider the new Newton poly tope and 
dominant terms. Since the polynomial p is positive in <S'\ Vq, there are terms in p that do not contain 
variables xi through xi. Therefore each new pdomr,Ci picks up some of those nonvanishing terms. 
Hence the (allowable) varieties V{p(iomr,Ci) have irreducible components strictly smaller than Wq 
(but still containing Vq). So, we can subdivide the set Fa^ further using the sets F^^ coming from 
the change of variables Ci and the resulting dominant terms will vanish on a set strictly smaller 
than Wq. In this fashion, we refine our subdivision repeatedly until we obtain a subdivision of the 
original set S into semialgebraic pieces (Sj) such that the associated dominant terms pdomj vanish 
in Sj only on Vq. Applying Lemma l4.39l in each such situation, we conclude that p can be evaluated 
accurately sufficiently close to Vq within each piece Sj. 

For each Vq, we therefore can find a collection (Sj) of semialgebraic sets, all determined by 
polynomial inequalities with integer coefficients, and the corresponding numbers £j, so that the 
polynomial p can be evaluated with accuracy r] in each ej -neighborhood of Vq within the piece 
Sj. Note that we can assume that each £j is a reciprocal of an integer, so that testing whether 
a particular point x is within £j of Vq within Sj can be done by branching based on polynomial 
inequalities with integer coefficients. 

The final algorithm will be organized as follows. Given an input x, determine by branching 
whether x is in Sj and within the corresponding £j of a component Vq. If that is the case, evaluate 
p{x) using the algorithm that is accurate in Sj in that neighborhood of Vq. For x not in any of 
the neighborhoods, evaluate p by Horner's rule. Since the polynomial p is strictly positive off the 
neighborhoods of the components of its variety, the reasoning of Section IHl applies, showing that the 
Horner's rule algorithm is accurate. If x is on the boundary of a set Sj, any applicable algorithm 
will do, since the inequalities we use are not strict. Thus the resulting algorithm for evaluating p 
will have accuracy rj as required. □ 

4.3.5 Obstacles to full induction 

The reasoning above suggests that there could be an inductive decision procedure that would allow 
us to determine whether or not a given polynomial is accurately evaluable by reducing the problem 
for the original polynomial p to the same problem for its dominant terms, then their dominant 
terms, and so forth, going all the way to monomials or other polynomials that are easy to analyze. 
However, this idea would only work if the dominant terms were somehow "simpler" than the original 
polynomial itself. This approach would require an induction variable that would decrease at each 
step. 

Two possible choices are the number of variables or the degree of the polynomial under con- 
sideration. Sometimes, however, neither of the two goes down, and moreover, the dominant term 
may even coincide with the polynomial itself. For example, if 

p{x) = A{x[3,n])xl + B{x[3,n])xiX2 + C(X[3:„])X^ 

where A, B, C are nonnegative polynomials in X3 through x^, then the only useful dominant term 
of p in the neighborhood of the set xi = 2:2 = is the polynomial p itself. Thus no progress 
whatsoever is made in this situation. 

Another possibility is induction on domains but we do not yet envision how to make this idea 
precise, since we do not know exactly when a given polynomial is accurately evaluable on a given 
domain. 
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Further work to establish a fuh decision procedure is therefore highly desirable. 

5 "Black-box" arithmetic 

In this section we prove a necessary condition (for both the real and the complex cases) for a more 
general type of arithmetic, which allows for "black-box" polynomial operations. We describe the 
type of operations below. 

Definition 5.1. We call a black-box operation any type of operation that takes a number of inputs 
(real or complex) xi, . . . ,Xk and produces an output q such that q is a polynomial in xi, . . . ,Xk- 

Example 5.2. q{xi, X2, xs) = xi + X2X3. 

Remark 5.3. Note that +, — , and ■ are all black-box operations. 

Consider a fixed set of multivariate polynomials {qj : j G J} with real or complex inputs 
(this set may be infinite). In our model under consideration, the arithmetic operations allowed 
are given by the black-box operations qi, . . . ,qk, and negation (which will be dealt with by way of 
dotted edges, as in Section With the exception of negation, which is exact, all the others yield 
a rnd{op{ai, . . . , ai)) = op{ai, . . . ,ai){l + 5), with \5\ < e (e here is the machine precision). All 
arithmetic operations have unit cost. We consider the same arithmetical models as in Section [2. 7| 
with this larger class of operations. 

5.1 Necessity: real and complex 

In order to see how the statement of the necessity Theorem 14.81 changes, we need to introduce a 
different notion of allowability. 

Recall that we denote by S the space of variables (which may be either M" or C"). From now 
on we will denote the set {1, . . . , n} by /C. 

Definition 5.4. Let p{xi, . . . , Xn) be a multivariate polynomial over S with variety V{p). Let 
ICz ^ /C, and let /C/), /C5 C /C x /C . Modify p as follows: impose conditions of the type Zi for each 
i £ ICz, and of type Dij, respectively Sij, on all pairs of variables in ICd, respectively ICs- Rewrite 
p subject to those conditions (e.g. set Xi = for all i G ICz), and denote it by p, and denote by ICr 
the set of remaining independent variables (use the convention which eliminates the second variable 
in each pair in ICd or ICs )■ 

Choose a set T C K.r, and let 

VT,Kz,ICD,ICsiP) =f^aV{qa) , 

where the polynomials qa are the coefficients of the expansion of p in the variables xt'- 

p{xi,...,Xk) = ^qaXT , 
a 

with qa being polynomials in X]q^\x only. 

Finally, let IC^ be a subset of ICr\T. We negate each variable in ICm , and let Vt,k.2,iCd,k.s,iCn 
be the variety obtained from Vt,icz,k.d,k:s{p)> '^'^^^ each variable in ICn negated. 
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Remark 5.5. V0 0(p) = V{p). We also note that, if we have a black-box computing p, then 
the set of all polynomials p that can be obtained from p by permuting, repeating, and negating the 
variables (as in the definition above) is exactly the set of all polynomials that can be evaluated with 
a single rounding error, using that black box. 

Definition 5.6. For simplicity, we denote a set {T,ICz,ICd,ICs,ICn) byi, and a set {T,ICz,K,D,K,s) 
by i+. 

Example 5.7. Let p(x,y,z) = x + y ■ z (the fused multiply- add). We record below all possibilities 
for 1 = (T,ICz,ICd,ICs,ICn), together with the obtained subvariety Vi{p). 

Without loss of generality, assume that we have eliminated all redundant or complicated condi- 
tions, like {x,y) £ K,d o,nd {x,y) G ICs (which immediately leads to x = y = 0, that is, x,y, S ICz. 
We assume thus that all variables not present in fCz cannot be deduced to be from conditions 
imposed by ICd or/and ICs. 

We obtain that all possibilities for Vi{p) are, up to a permutation of the variables, 

O {x = 0}, {x = l}, {x = -l}, 

O {x = 0}U{x = 1}, {x = 0}U{x = -1}, 

O {x = 0}U{y = 0}, {x = 0}U{y = 1}, {x = 0} U {y = -1}, 

O {x = — y^}, {x = y'^}, {x — y • z = 0}, and {x + y • z = 0}. 
Definition 5.8. We define g_2(xi, X2) = X1X2, X2) = xi + X2, and qo{xi,X2) = xi — X2. 

Remark 5.9. The sets 

1. Z, = {x : Xi = 0} , (14) 

2. Sij = {x : Xi + Xj = 0} , (15) 

3. Dij = {x : Xi — Xj = 0} , (16) 

and unions thereof, describe all non-trivial (neither nor S) sets of type V,, for q-2, Q-i, <ind qq. 

We will assume from now on that the black-box operations q-2, Q-i, go defined in 15.81 and some 
arbitrary extra operations qj, with j £ J {J may be infinite) are given and fixed. 

Definition 5.10. We call any set Vi{qj) with 1 = {T,JCz,ICd,ICs,ICn) o-s defined above and qj a 
black-box operation basic q-allowahle. 

We call any set R irreducible q-allowable if it is an irreducible component of a (finite) intersec- 
tion of basic q-allowable sets, i.e., when R is irreducible and 

where each Qi is a basic q-allowable set. 

We call any set Q q-allowable if it is a (finite) union of irreducible q-allowable sets, i.e. 

Q = 'JjRj , 

where each Rj is an irreducible q-allowable set. 

Any set R which is not q-allowable we call q-unallowable. 
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Remark 5.11. Note that the above definition of q- allowability is closed under taking union, in- 
tersection, and irreducible components. This parallels the definition of allowability for the classical 
arithmetic case - in the classical case, every allowable set was already irreducible (being an inter- 
section of hyperplanes). 

Once again, we need to build the setup to state and prove our new necessity condition. To do 
this, we will modify the statements of the definitions and the statements and proofs of the lemmas 
from Section [4.11 Since most proofs just follow in the footsteps of those from Section [4. 11 instead 
of repeating them, we will only point out the places where they differ and show how we modified 
them to work in the new context. 

Definition 5.12. Given a polynomial p with q-unallowable variety V{p), consider all sets W that 
are q-allowable (as in Definition \5.1(^) . and subtract from V{p) those W for which W C V{p). We 
call the remaining subset of the variety points in general position and denote it by Q{p). 

Remark 5.13. Since V{p) is q-unallowable, Q{p) is non-empty. 

Definition 5.14. Given x G S, define the set q^— Allow {x) as the intersection of all basic q-allowable 
sets going through x: 



for all possible choices ofT,ICz,ICD,fCs,ICN- 

The intersection in parentheses is S whenever x ^ Vi{qj) for all possible i. 

Remark 5.15. When x £ G{p), q— Allow(x) ^ Gip)- 

We can now state our necessity condition. 

Theorem 5.16. Given the black-box operations (7_2, go; f^^^^ {Qj ■ j £ J}> o-i^d the model 
of arithmetic described above, let p be a polynomial defined over a domain T) d S. Let Q{p) be 
the set of points in general position on the variety V{p). If there exists x G "D n Q{p) such that 
q— Allow(x) n Int(P) ^ 0, then p is not accurately evaluable on T>. 

We proceed to the construction of the elements of the proof of Theorem 15.161 The algorithm 
will once again be represented by a DAG with input nodes, branching nodes, and output nodes. 
As in Section^ for simplicity in dealing with negation (since negation is exact), we will work with 
solid edges, which convey a value unchanged, and dotted edges, which indicate that negation of the 
conveyed quantity has occurred. 

From now on, unless specified, we will consider only non-branching algorithms. 

We will continue to use the definition of a Zariski set (Definition 14. 9jl on a hypercube in (5-space, 
and we work with the same definition of a non-trivial computational node (recalled below). 

Definition 5.17. For a given x £ S, we say that a computational node N is of (7-non-trivial type 
if its output is a nonconstant polynomial of 6. 

Recall the notation from Definition 14.121 

The equivalent of Proposition I4.1,''{1 becomes the following. 



q-Allow(2;) := njgju{-2 -1,0} i^i xeKfe) K(gj) 
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Proposition 5.18. Given any algorithm, any e > 0, and a point x in Q{p), there exists a Zariski 
open set A C such that no q-non-trivial computational node has a zero output on the input x 
for all G A. 

Proof. The proof of Proposition 15.181 follows the same path as that of Proposition 14.11^1 To each 
g-non-trivial node corresponds a Zariski open set in (5-space; there is a finite number of them, and 
their intersection provides us with the Zariski open set we are looking for. □ 

We will now state and sketch the proof of the equivalent of Lemma 14.141 

Lemma 5.19. For a given algorithm, x G Q{p), and e > 0, exactly one of the following holds: 

1. there exists a Zariski open set A £ such that the value Pcomp{x,5) computed by the algo- 
rithm is not zero when the algorithm is run with source input x and errors 5 G A; 

^- Vcompiy-, 5) = for all y G q— Allow(x) and all 6 £ H^. 

Proof of Lemma \5.1{A Give x G G{p), choose 5 from the Zariski open set A whose existence is given 
by Proposition 15.181 Either the output node is of g- non-trivial type (in which case Pcompi^: ') ^ 
and we are done) or the output is a nonzero constant polynomial in 5 (and again we are done) or 
the output is the zero polynomial in 6. In this latter case, we trace back all the zeros again, as in 
the proof of Lemma 14.141 and get a set of paths of nodes that produced all 0. 

Let us start from the last nodes on these paths and work our way up, level after level. The last 
node on such a path is either a source, or a node with all inputs from sources, or a node with at 
least one input which is not a source and not (and hence a polynomial in 6). In the former two 
cases, we have traced back the zeros to basic g-allowable conditions. We will show that this is also 
true for the latter case. 

Lemma 5.20. If the last zero occurs at a node which computes a black-box operation qj and which 
has some source inputs and some (nontrivial) polynomial inputs, then the sources lie in some Vi{qj) 
(and this constraint causes the zero output of this node). 

Proof of Lemma \5.2(A Label the node (assume it corresponds to the black-box qj) with output 
0; then some inputs are sources, and some are polynomials of 6 (since this was the last node on 
the path, it has no zero inputs). By the choice of 5, it follows that the output has to be the 
polynomial in 6. 

Some of the non-source inputs to N might come from the same nodes; assume you have a total 
of / distinct nodes which input to N (nonconstant) polynomials of 6, and let Ii{6), l2{S), . . ., Ii{6) 
denote these inputs. We will need the following lemma. 

Lemma 5.21. Since the DAG is acyclic, Ii{6), l2{6), . . . are algebraically independent poly- 

nomials in 5. 

Proof of Lemma \5.21\ Suppose there is some polynomial dependence among /i((5), . . ., Let 
A^i, . . ., Ni be the nodes which have computed /i((5), . . . , and let 81,82, ... ,6i be the specific 
8s at these nodes. Let Di, . . . ,Di be the set of 8s present (non-trivially) in the output of each node, 
e.g., 8i G Di. At least one 8i is not present in Ui^jDj; otherwise we get a cycle. But then Ii{8) is 
algebraically independent from the other inputs, i.e., there is some dependence among the inputs 
Ij{8) with j 7^ i. We use induction on the number of remaining inputs, and exclude one input at a 
time, until we're left with a contradiction. This proves Lemma 15.211 □ 
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Replace each Ii{5) (or —Ii{5)) in Y by the same dummy variable Zi (respectively —Zi), for each 
i G {1, ...,/}. The variables Zi are algebraically independent by Lemma 15.211 

Denote by z the new vector of inputs to (both values and variables). The value qj{z) = 0, 
regardless of the ZiS (since it was regardless of the 6s and the Zi are algebraically independent 
variables). It follows that the constraints which place z on the variety of qj are twofold: they 
come from constraints of the type Dij and Sij which describe the places where we inputted the 
values Zi, and they come from imposing conditions on the other inputs, which are sources. Thus the 
constraints on z are of the form VTfi,KD,lCs,>CN ~- This concludes the proof of LemmaESOl □ 

Now that we have shown that the last marked vertices all provide basic g-allowable conditions, 
we proceed by induction: we look at a "next" marked vertex (here "next" means that all its marked 
ancestors have been examined already). It has some zero inputs, some source inputs, some of the 
inputs satisfy constraints of type Dij and Sij, and some of the inputs are polynomial. Prom here 
on we proceed as in Lemma I5.2U1 and obtain a set of new constraints to be imposed on the sources, 
of the type Vi{qj), which we will intersect with the rest of the constraints obtained so far. 

At the end of the examinations, we have found a set of basic g-allowable constraints which 
the sources must satisfy, i.e., a list of basic g-allowable sets with the property that the sources lie 
in their intersection; the fact that the sources satisfy these constraints is responsible for the zero 
output at the end of the computation. 

It is not hard to see that in this case, once again, it follows that for all y in q— Allow(x) and 
any 6 £ A, the output is (just as in Lemma l4. 14(1 . Thus we have proved Lemma 15.191 □ 

From Lemma 15.191 we obtain the following corollary. 

Corollary 5.22. For any algorithm, for any e > 0, and any x G G{p), exactly one of the following 
holds: the relative error of computation, \pcomp ~ p\/\p\; is either infinity at x for all 6 in a Zariski 
open set or 1 at all points y £ (q— Allow(x) \ V{p)) and all 6 £ H^. 

We will now consider algorithms with or without branches. 

Theorem 5.23. Given a (branching or non- branching) algorithm with output function Pcomp{-), 
X G Q{p), and e > 0, then one of the following is true: 

1. there exists a set Ai G of positive measure such that pcomp{x,6) is nonzero whenever the 
algorithm is run with errors 5 G Ai, or 

2. there exists a set A2 G H^: of positive measure such that for every 6 G A2, there exists a 
neighborhood Ns{x) of x such that for every y G A''5(a;) n (q — Allow(a;) \ V{p)), Pcomp{y,6) = 
when the algorithm is run with errors 6. 

Remark 5.24. Just as before, this implies that, on a set of positive measure in H^, the relative 
accuracy of any given algorithm is either 00 or 1 . 

Proof. The proof is essentially the same as in Theorem 14.171 the only thing that needs to be 
examined is the existence in q— Allow(rE) \ V{p) of an infinite sequence {j/n} with y„ x. 

We will make use of the following basic result in the theory of algebraic varieties, which can for 
example be found as Theorem 1 in [581 Section 6.1]. 

Result 5.24.1. If X and Y are polynomial varieties such that X CY , then dim{X) < dim{Y). 
IfY is irreducible and X CY is a (closed) subvariety with dim{X) = dim{Y), then X = Y. 
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We write q— Allow(x) as a union of irreducible g-allowable components. By the way we defined 
G{p), it follows that none of these components is included in V{p); by E,esult lB.24.H it follows that 
the intersection of any irreducible g-allowable component P of q— Allow(a;) with V{p) has a smaller 
dimension than P. 

Choose the (unique) irreducible component P that contains x; this component must have di- 
mension at least 1 (since if it contained only x, the set {x} would be g-allowable, and hence we 
would have extracted it from V{p), which is a contradiction with the fact that x G Gip))- Since 
P \ V{p) has a smaller dimension than P, there must be some infinite sequence {y-n} in P \ V{p), 
i.e. in q— Allow(x) \ V{p), such that y„ x. 

The rest of the argument goes through just as in Theorem 14.171 □ 

Finally, as in Section [4. 11 we have a corollary. 

Corollary 5.25. Letp be a polynomial over S with unallowable variety V{p). Given any algorithm 
with output function Pcomp{')j o, point x G Q{p), e > 0, and i] < 1, there exists a set of positive 
measure arbitrarily close to x and a set A G of positive measure, such that \pcomp — p\/\p\ is 
strictly larger than 7] when computed at a point y £ A^ with errors 5 £ A. 

The proof is based on the topology of S, and is identical to the proof of Corollary 14.191 we 
choose not to repeat it. 

Proof of Theorem \5.16[ Follows immediately from Theorem 15.231 and Corollary 15.251 □ 
5.2 Sufficiency: the complex case 

In this section we obtain a sufficiency condition for the accurate evaluability of a complex polyno- 
mial, given a black-box arithmetic with operations g_2, q-i, qo and {qj\j G J} {J may be an infinite 
set). 

Throughout this section, we assume our black-box operations include q^, which consists of 
multiplication by a complex constant: q^{x) = c ■ x. Note that this operation is natural, and that 
most computers perform it with relative accuracy. 

We believe that the sufficiency condition we obtain here is sub-optimal in general, but it 
subsumes the sufficiency condition we found for the basic complex case with classical arithmetic 
{+,-,•}• 

We assume that the black-box polynomials defining the operations qj with j £ J are irreducible. 

Lemma 5.26. The varieties Vi{qj) are irreducible for any j £ J and any i as in Definition \5.4\ if 
and only if all qj, j £ J are affine polynomials. 

Proof. If qj is an affine polynomial then any Vi{qj) is also affine, hence irreducible over C". Con- 
versely, if qj is not an affine polynomial, then by inputting a single value x for all the variables, we 
obtain a one- variable polynomial of degree at least 2, which is necessarily reducible over C". □ 

We state here the best sufficiency condition for the accurate evaluability of a polynomial we 
were able to find in the general case, and a necessary and sufficient condition for the all-affine 
black-box operations case. 
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Theorem 5.27 (General case). Given a polynomial p : C" C with V{p) a finite union 
of intersections of hyperplanes Zi, Sij, Dij, and varieties V{qj), for j G J, then p is accurately 
evaluable. 

Theorem 5.28 (AfRne case). If all black-box operations qj, j € J are affine, then a polynomial 
p : C" — > C is accurately evaluable iff V {p) is a union of intersections of hyperplanes Zi, Sij, Dij, 
and varieties Vi{qj), for j £ J and i as in Definition \5.4\ 

We will begin by proving Theorem 15.271 We will once again make use of Theorem 14.221 and of 
Theorem EMU 

Lemma 5.29. IfV{p) is as in Theorem \5.21\ then V{p) is a simple finite union of hyperplanes 
Zi, Sij, Dij and varieties V {qj) ('with no intersections 

Proof. Indeed, if that were not the case, then some irreducible g-allowable component P of V{p) 
would be an intersection of two or more sets described in Theorem 15. 271 If P were contained in the 
intersection of two or more (distinct) hyperplanes, its dimension would be smaller than n — 2, and 
we would get a contradiction to Theorem 14.221 

Suppose now that P was contained in the intersection of a y{qj) with some other variety or 
hyperplane. All such varieties, by Theorem 14.221 must have dimension n — 1, and since all such 
varieties and hyperplanes are irreducible, by E,esult 1^.24.11 their intersection must have dimension 
strictly smaller than n — 2. Contradiction; we have thus proved that the variety V{p) is a simple 
union of hyperplanes Zi, Dij, Sij, and varieties V{qj). □ 

Corollary 5.30. // p : C" C is a polynomial whose variety V{p) is q-allowable, then it is a 
product p = cYljPj, where each pj is a power of Xi, (xj — Xj), {xi + Xj), or qj, and c is a complex 
constant. 

Proof. By Lemma 15.291 the variety V{p) is a union of basic g-allowable hyperplanes and varieties 
V{qj). 

Choose an irreducible g-allowable set in the union. If this set is a hyperplane, then by following 
the same argument as in Corollarv 14.24( we obtain that p factors into some p and some power of 
either Xi, {xi — Xj), or (xj + Xj). 

Suppose now that the irreducible g-allowable set were a variety V{qj); since p is whenever qj 
is and qj is irreducible, it follows that qj divides p. We factor then p into the largest power of qj 
which divides it, and some other polynomial p. 

In either of the two cases, we proceed by factoring p in the same fashion, until we encounter a 
polynomial p of degree 0. That polynomial is the constant c. □ 

Proof of Theorem \5.27l By Corollarv 14.241 p = c Y\j pj, with each pj a power of Xk, {xk ± x/), or qi. 

Since each of the factors is accurately evaluable, the algorithm that forms their product evaluates 
p accurately. Multiplication by c (corresponding to the black-box q'^) is also accurate, hence p is 
accurately evaluable. □ 

The proof of Theorem 15.281 follows the path described above; we sketch it here. 

Proof of Theorem 15.281 The key fact in obtaining this condition is the irreducibility of all sets 
Vi{qj), which is guaranteed by Lemma 15.261 together with the result of Lemma 15.291 Once again, 
we can write the polynomial as a product of powers of Xi, {xi it Xj), or V^^qj), times a constant; 
this takes care of the sufficiency part, while the necessity follows from Theorem 15.161 □ 
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Remark 5.31. Note that Theorem \5.28\ is a more general necessary and sufficient condition than 
Theorem \4-21\ which only considered having g_2,(?-i, and qo as operations, and restricted the 
polynomials to have integer coefficients (thus eliminating the need for q'^). 



6 Accurate Linear Algebra in Rounded Arithmetic 

Now we describe implications of our results to the question of whether we can accurately do 
numerical linear algebra on structured matrices. By a structured matrix we mean a family of n- 
by-n matrices M whose entries Mij(x) are simple polynomial or rational functions of parameters 
X. Typically there are only 0{n) parameters, and the polynomials Mij{x) are closely related (for 
otherwise little can be said). Typical examples include Cauchy matrices {Mij{x,y) = l/{xi + yj)), 
Vandermonde matrices (Mij(x) = generalized Vandermonde matrices {Mij{x) = x^ ^ , 




where the Xj are a nondecreasing sequence of nonnegative integers), Toeplitz matrices {Mij{x) = 
Xi-j), totally positive matrices (where M is expressed as a product of simple nonnegative bidiagonal 
matrices arising from its Neville factorization), acyclic matrices, suitably discretized elliptic partial 
differential operations, and so on |2ll ESI Eli 123 EOl EEl EHl EH 112 EHl • 

It has been recently shown that all the matrices on the above list (except Toeplitz and non- 
totally-positive generalized Vandermonde matrices) admit accurate algorithms in rounded arith- 
metic for many or all of the problems of numerical linear algebra: 

• computing the determinant 

• computing all the minors 

• computing the inverse 

• computing the triangular factorization from Gaussian elimination, with various kinds of piv- 
oting 

• computing eigenvalues 

• computing singular values 

We have gathered together these results in Table ^ 



The proliferation of these accurate algorithms for some but not all matrix structures motivates 
us to ask for which structures they exist. 

To convert this to a question about polynomials, we begin by noting that being able to compute 
the determinant accurately is a necessary condition for most of the above computations. For 
example, if the diagonal entries of a triangular factorization of A, or its eigenvalues, are computable 
with small relative error, then so is their product, the determinant. 

It is also true that being able to compute all the minors accurately is a sufficient condition for 
many of the above computations. For the inverse and triangular factorization, this follows from 
Cramer's rule and Sylvester's theorem, resp., and for the singular values an algorithm is described 
in UllEl- 
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Table 1: General Structured Matrices 
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Thus, if the determinants Pn{x) = detM"'^"(x) of a class of n-by-n structured matrices M do 
not satisfy the necessary conditions described in Theorem 15.161 for any enumerable set of black- 
box operations (perhaps with other properties, like bounded degree), then we can conclude that 
accurate algorithms of the sort described in the above citations are impossible. 

In particular, to satisfy these necessary conditions would require that the varieties V{pn) be 
allowable (or g-allowable) . For example, if 1/ is a Vandermonde matrix, then det(y) = Yli^jixi—xj) 
satisfies this condition, using only subtraction and multiplication. 

The following theorem states a condition which guarantees the impossibility of an algorithm 
using any enumerable set of black-box operations of bounded degree: 

Theorem 6.1. Let M{x) be an n-by-n structured complex matrix with determinant Pn{x) as de- 
scribed above. Suppose Pn{x) has an irreducible factor pn{x) whose degree goes to infinity as n goes 
to infinity. Then for any enumerable set of black-box arithmetic operations of bounded degree, for 
sufficiently large n it is impossible to accurately evaluate Pn{x) over the complex numbers. 

Proof. Let qi, ....,qm be any finite set of black-box operations. To obtain a contradiction, suppose 
the complex variety V{pn) satisfies the necessary conditions of Theorem 1^.161 i.e., that V{pn) Is 
allowable. This means that V{pn)^ which includes the hypersurface V{pn) as an irreducible com- 
ponent, can be written as the union of irreducible g'-allowable sets (by Def. I5.1(T|) . This means that 
V{pn) must itself be equal to an irreducible g-allowable set (a hypersurface), since representations 
as unions of irreducible sets are unique. The irreducible (^-allowable sets of codimension 1 are de- 
fined by single irreducible polynomials, which are in turn derived by the process of setting variables 
equal to one another, to one another's negation, or zero (as described in Defs . WT^ and l5?T(l|) . and 
so have bounded degree. This contradicts the unboundedness of the degree of V{pn)- Q 

In the next theorem we apply this result to the set of complex Toeplitz matrices. We use the 
following notation. Let T be an n-by-n Toeplitz matrix, with Xj on the j-th diagonal, so xq is on 
the main diagonal, Xn-i is in the top right corner, and xi-n is in the bottom left corner. 
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Theorem 6.2. The determinant of a Toeplitz matrix T is irreducible over any field. 

Corollary 6.3. The determinants of the set of complex Toeplitz matrices cannot be evaluated 
accurately using any enumerable set of bounded- degree black-box operations. 

Proof of Theorem \6.SA We use mduction on n. We note that det T depends on every variable xj, 
because det T includes the monomials ±x^~-' x-j_^ for j > 0, as well as Xq, and these monomials 
contain the maximum powers of xj and Xj-n appearing in the determinant. Now Xn-i appears 
exactly once in T, so det T must be an affine function of say det T = x„_i • pin + P2n- 

By expanding det T along the first row or column, we see that pin is itself the determinant of a 
Toeplitz matrix with diagonals ...,Xn-3, and p2n depends on ...,Xn-2 but not If 

det T = Xn-i ■ Pin +P2n Were reducible, its factorization would have to look like Xn-i ■ Pin +P2n = 
{xn-i • Psn + P4n)P5n, where all the subscripted p polynomials are independent of x^-i, implying 
either that pin = PznP^n were reducible, a contradiction by our induction hypothesis, or p^n = 1 
and sopin\p2n- Now we can write = xl^-2lin + Xn-2<l2n + <li,n where gi„ / 0, since det T includes 
the monomial ±x^_2x"2^ ^^^d no higher powers of Xn-2- Furthermore qin is independent of x^-i, 
Xn-2 and x„_3, and q2n and are independent of Xn-i and x„_2- Since pin is independent of 
Xn-2, the only way we could have pin\p2n is to have pin\qin, Pin\q2n, and pinlqsn- But since pin 
depends on Xns and qin is independent of Xns, this is a contradiction. So the determinant of a 
Toeplitz matrix must be irreducible. □ 

In the real case, irreducibility of pn is not enough to conclude that p„ cannot be evaluated 
accurately, because MR(Pn) niay still be allowable (and even vanish). So we consider another 
necessary condition for allowability: Since all black-boxes have a finite number of arguments, their 
associated codimension-1 irreducible components must have the property that whether x £ Vi^qj) 
depends on only a finite number of components of x. Thus to prove that the hypersurface Vu{pn) 
is not allowable, it suffices to find at least one regular point x* in V^{pn) such that the tangent 
hyperplane at x* is not parallel to sufficiently many coordinate directions, i.e., membership in 
depends on more variables than any Vi{qj). This is easy to do for real Toeplitz matrices. 

Theorem 6.4. Let V be the variety of the determinant of real singular Toeplitz matrices. Then 
V has codimension 1, and at almost all regular points, its tangent hyperplane is parallel to no 
coordinate directions. 

Corollary 6.5. The determinants of the set of real Toeplitz matrices cannot be evaluated accurately 
using any enumerable set of bounded- degree black-box operations. 

Proof of Theorem \6.4\ Let Toep{i, j) denote the Toeplitz matrix with diagonal entries Xi through xj; 
thus Toep{i,j) has dimension (j — i)/2 + 1. Let U be the Zariski open set where det Toep{i,j) ^ 
for alll — n<i<j<n — 1 and j — i even. Then det T is a nonconstant affine function of 
and so for any choice of xi-n, ...,x„_2 in U, det T is zero for a unique choice of a;„„i. This shows 
that Mr (det T) has real codimension 1. 

Furthermore, det T has highest order term in each Xi, < i < n — 1, equal to zizToep{l — 
n,2i — n — i.e., with nonzero coefficient on U. It also has the highest order term in each Xj, 

1 — n < i < 0, equal to ±Toep{n + 2i + l,n — l)x"^*, i.e., with nonzero coefficient on U. Finally, 
the highest order term in xq is Xq, with coefficient 1. Thus the gradient of det T has all nonzero 
components on a Zariski open set, and whether det T = depends on all variables. □ 
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6.1 Vandermonde matrices and generalizations 

In this section we will explain the entries filled "No" in Tabled First we will show that polynomial 
Vandermonde matrices do not have algorithms for computing accurate inverses, by proving that 
certain minors needed in the expression cannot be accurately computed (this will also explain the 
"No" in the Any minor column). Finally, we will show that the LDU factorization for polynomial 
Vandermonde matrices cannot be computed accurately. 

First we consider the class of generalized Vandermonde matrices V, where Vij = Pj-i{xi) is a 
polynomial function of Xj, with 1 < i, j < n. This class includes the standard Vandermonde (where 
Pj-i{xi) = and many others. 

Consider a generalized Vandermonde matrix where Pj-i{xi) = x^"^"'"'*'' with < Ai < A2 < 
■ ■ ■ ^ A„. The tuple A = (Ai, A2, A„) is called a partition. Any square submatrix of such a 
generalized Vandermonde matrix is also a generalized Vandermonde matrix. A generalized Van- 
dermonde matrix is known to have determinant of the form s\{x) Y\i^j{xi — Xj) where s\{x) is a 
polynomial of degree |A| = X^jAj, and called a Schur function In infinitely many variables 

(not our situation) the Schur function is irreducible |32j . but in finitely many variables, the Schur 
function is sometimes irreducible and sometimes not |611 Exer. 7.30]. But there are irreducible 
Schur functions of arbitrarily high degree. Thus we conclude by Theorem 16. II that no enumerable 
set of black-box operations of bounded degree can compute all Schur functions accurately when the 
Xi are complex. 

If we restrict the domain D to be nonnegative real numbers, then the situation changes: The 
nonnegativity of the coefficients of the Schur functions shows that they are positive in T>, and in- 
deed the generalized Vandermonde matrix is totally positive j39j . Combined with the homogeneity 
of the Schur function. Theorem 13.21 implies that the Schur function, and so determinants (and mi- 
nors) of totally positive generalized Vandermonde matrices can be evaluated accurately in classical 
arithmetic. For accurate algorithms that are more efficient than the one in Theorem 13.21 see j29j . 

Now consider a polynomial Vandermonde matrix Vp defined by a family {Pk{x)}k^fq of poly- 
nomials such that deg(Pfc) = k, and Vp{i,j) = Pj-i{xi). 

Note that any Vp can be written as Vp = VC, with V being a regular Vandermonde matrix, 
and C being an upper triangular matrix of coefficients of the polynomials Pk, i.e., 

j 

P,_i(x) = Y^C{i,j)x'-^ , VI < i < n . 

i=l 

Denote by Cj_i := D{i,i), for all 1 < i < n the highest-order coefficients of the polynomials 
Po{x), . . .,Pn-i{x). 

To compute the inverse of the matrix Vp, we need to compute the minors that result from 
deleting a row and a column, i.e., (in MATLAB notation) det(Vp([l :z— 1, z+1 :n], 1, j+l:n]). 
We will focus our attention on the computation of the {i,n — 1) minors, i.e., the ones that result 
from deleting any of the rows and the (n — l)st column. 
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The resulting matrices look like 



Mp^i ■.= Vp{[l:i-l,i+l:n],[l:n-2,n]) 



Co -Pi(xi) 

Co Pl{Xi-i) 

Co Pl{xi+i) 

Co Pl{Xn) 



. Pn-3{xi) Pn-lixi) 

. Pn-siXi-l) Pn-liXi-l) 
. Pn-3{Xi+l) 



Pn—3 i^n) 



-fn— 1 (^n) 



Hence, we can manipulate the columns of det(Mp^j) by subtracting from them linear combinations 
of other columns, to obtain 



det (Mp,i 



Co CiXi 



C2xl 



„n-2 



Co CiXi_i C2Xj_i 
Co CiXj+i C2xLi 



Cn-3xl Cn-ixl ^ +C{n- 1, n)xl 



Cn-3X^_l Cn^lX^_l + C{n - 1, n)x^ 2 



-ix'l'l + C{n- l,n)x 



i-1 
n-2 

i+l 



Co CiXr, 



C2X2 



Cn-3Xn ^ Cn-ixl ^ + C{n- l,n)xl ^ 



By expanding on the last column, and using the results from we obtain that there are 



constants E and F, specifically E = C{n — \,n) HILi^ '-i-i ^'^'^ ^ ~ ^ cin-in) ' ^^^^ that 



det (Mp,i ) = JJ {xj - Xk) [E + F ■ s\^^{xi, . . . ,Xi-i,Xi+i, 

k<j 
k,j 7^ i 



with S[i] being the Schur function corresponding to the partition A = (1, 0, . . . , 0), i.e. 



det(Mp,i) = Y[ i^j - Xk) [E + F ■ {xi + . . . + + Xj+i 
k<j 



+ Xn)] 



and it is not hard to see that for any ^ 4, the above polynomial in xi, . . . , x^— i, x^-i-i, • • • , x^ 
does not have an allowable variety, and hence the inverse cannot be evaluated accurately in classical 
arithmetic. 

Denote x^i := (xi, . . . ,Xi-i,Xi+i, . . . ,Xn)- 

Similarly, one can prove that the — A;) minor det Mpi k can be obtained as 



det(Mj 



P,i,kj 



JJ {xj-xk) [Ai + ^2S[i](a;^i) + ■ • • + ^fcS[ife](x^i)] , 

k < i//k, k / i 



where [l'] = (1, 1, 1, . . . , 0), the right side containing exactly I ones and the rest 0; Ai, . . . ,Ak are 
constants which can be computed easily in terms of the entries of the matrix C. 

Since for any I, sj^ij is a homogeneous irreducible function of degree the factor in the square 
brackets has degree k. Appropriate choices of n and the matrix C are likely to make this factor 
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irreducible (for example, by making \Ak\ ^ \Ai\ for all / 7^ k). If this is the case, then by 
Theorem 16.11 this family of matrices has inverses that cannot be evaluated accurately even with 
the addition of any enumerable set of bounded-degree black-boxes. 

This explains why we have filled in with "No" the entries corresponding to columns and 
"Any minor^^ in the Polynomial Vandermonde row of Table ^ Below we explain the "No" in the 
Polynomial Vandermonde row, in the column "LDC/". 

We can write C = DC, with D being the diagonal matrix of highest-order coefficients, i.e., 
D{i,i) = C{i,i) for all 1 < i < n. We will assume that the matrices C and D are given to us 
exactly. 

If we let Vp = LpDpUp and V = LDU, it follows that 

Lp = L- 
Dp = DD ; 
Up = D~^UC . 

Since we cannot compute L accurately in the general Vandermonde case, it follows that we 
cannot compute Lp accurately in the polynomial Vandermonde case. 

Finally, we explain the "*" entries in the polynomial Vandermonde row. These depend on 
special properties of the polynomial. In general, neither the SVD nor the symmetric eigenvalue 
decomposition (EVD) are computable accurately, but if the polynomials are certain orthogonal 
polynomials, then the accurate SVD is possible |2H]) and an accurate symmetric EVD may also be 
possible [HO] . 
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